Abstract. We are concerned with Nehari's theorem on Hardy space on a polydisk. Define 'little' Hankel operators on product Hardy space $H^2(\mathbb{C}_+^d)$ by

$$H_b \varphi \stackrel{\mathrm{def}}{=} P^{\ominus} M_{\bar b}\, \varphi ,$$

where $P^{\ominus}$ is the orthogonal projection from $L^2(\mathbb{R}^d)$ to $H^2_{\ominus}(\mathbb{C}_+^d)$ and $M_{\bar b}$ is the operator of multiplication by $\bar b$. We present the proof of Ferguson and Lacey [27] and Lacey and Terwelleger [38] that we have the equivalence of norms

$$\|H_b\| \simeq \|b\|_{\mathrm{BMO}(\mathbb{C}_+^d)}$$

for analytic functions $b$. Here, $\mathrm{BMO}(\mathbb{C}_+^d)$ is the dual to $H^1(\mathbb{C}_+^d)$ as discovered by Chang and R. Fefferman. This article begins with the classical Nehari theorem, and presents the necessary background for the proof of the extension above. The proof of the extension is an induction on parameters, with a bootstrapping argument. Some of the more technical details of the earlier proofs are now seen as consequences of a paraproduct theory.
1. Introduction

These notes concern the subject of Nehari's theorem, on Hardy space of the disk, and 
products of the disk. The theorem on the disk is classical, with three different approaches 
possible; the same question on products of the disk, the polydisk of the title, is a new result 
of the author, Sarah Ferguson and Erin Terwelleger [27,38]. The proof in the product setting 
is much more complicated, with currently only one proof known. It relies upon a delicate 
bootstrapping argument with an induction on parameters. These elements are suggested 
by the harmonic analysis associated with product theory, as developed by S.-Y. Chang,
R. Fefferman and J.-L. Journé [6-10,35,36]. These notes will provide an approach to this
result that is more leisurely than the research articles on the subject. We in particular include 
a great many references, and a description of related results and concepts. The proof of the 
main theorem we give is a little more 'structural' in that the main technical estimates are 
seen as consequences of a theory of paraproducts. 

The key concepts of this paper concern the intertwined topics of Hankel operators, Hardy 
space, Hilbert transforms, commutators, and paraproducts. Let us describe Hankel matrices. 
Consider a function $b$ in $L^2(\mathbb{T})$, and the operator of pointwise multiplication by $b$. That is, $M_b \varphi \stackrel{\mathrm{def}}{=} b \cdot \varphi$. Now, $L^2(\mathbb{T})$ has the exponential basis. We view the circle as embedded in the natural way in the complex plane, so that a relevant basis is $\{z^n : n \in \mathbb{Z}\}$. The decomposition of functions in this basis of course generates the Fourier transform:

$$\hat f(n) = \int_{\mathbb{T}} f(z)\, z^{-n}\, |dz| .$$

In the exponential basis, $M_b$ has a matrix form,

$$M_b \longleftrightarrow \{\hat b(i-j) : i,j \in \mathbb{Z}\}.$$

Restrictions of this matrix give Hankel and Toeplitz operators.

The restriction of the matrix to the upper quadrant $\mathbb{N} \times \mathbb{N}$ is a Toeplitz matrix. Namely, $T = \{t_{ij} : i,j \in \mathbb{N}\}$ is a Toeplitz matrix iff $t_{ij} = \alpha_{i-j}$ for some numerical sequence $\alpha$ on $\mathbb{Z}$. In this note, we are principally interested in the boundedness properties of operators. It is easy to see that a Toeplitz matrix is bounded iff the $\alpha_j$ are the Fourier coefficients of a bounded function. How this statement changes for Hankel matrices occupies our attention.

Restricting the matrix to, say, the quadrant $\mathbb{N} \times (-\mathbb{N})$ gives a Hankel matrix. Namely, $H = \{h_{ij} : i,j \in \mathbb{N}\}$ is a Hankel matrix iff $h_{ij} = \alpha_{i+j}$ for some numerical sequence $\alpha$ on $\mathbb{N}$.

In passing to these restrictions of the matrix for $M_b$ we are implicitly restricting the matrix on $\ell^2(\mathbb{Z})$ to one on $\ell^2(\mathbb{N})$. Namely, Hankel and Toeplitz matrices are operators on $\ell^2(\mathbb{N})$. The natural analog in $L^2(\mathbb{T})$ is Hardy space $H^2(\mathbb{T})$. By definition, $H^2(\mathbb{T}) = H^2_+(\mathbb{T})$ is the closed subspace of $L^2(\mathbb{T})$ generated by $\{z^n : n \ge 0\}$. It is natural to call these functions analytic, as $f \in H^2(\mathbb{T})$ admits an analytic extension to the disk $\mathbb{D}$ given by

$$F(z) = \sum_{n \ge 0} \hat f(n)\, z^n .$$

Functions in $H^2_-(\mathbb{T}) = L^2(\mathbb{T}) \ominus H^2(\mathbb{T})$ are referred to as antianalytic.

Let us describe the Hankel operators on $H^2(\mathbb{T})$. Let $P_\pm$ be the orthogonal projection from $L^2(\mathbb{T})$ onto the subspace $H^2_\pm(\mathbb{T})$. A Hankel operator with symbol $b$ is an operator $H_b$ from $H^2_+(\mathbb{T})$ to $H^2_-(\mathbb{T})$ given by $H_b \varphi \stackrel{\mathrm{def}}{=} P_- M_{\bar b}\, \varphi$. It is clear that this definition only depends on the analytic part of $b$.

1.1. Remark. The placement of the conjugate symbol is somewhat arbitrary, and is adopted 
in this way only for convenience. Richard Rochberg avoids such complications by defining 
Hankel operators as bilinear operators $B$ from $H^2 \times H^2$ into $\mathbb{C}$, which are linear on products: $B(\varphi, \psi) = L(\varphi \cdot \psi)$ for a linear functional $L$.
A central operator on $\ell^2(\mathbb{N})$ is the shift operator $S(a_0, a_1, \dots) = (0, a_0, a_1, \dots)$. Hankel operators $H$ are distinguished by their intertwining with the shift operator:

HS = S*H . 

Proof is left for the reader. 
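
For readers who want to see the intertwining concretely before proving it, here is a small numerical sketch in Python. It is an illustration only, not part of the original notes: the sequence `a` is arbitrary test data, and the helper names `hankel`, `shift`, `transpose`, `matmul` are ours. Because we truncate to $n \times n$ matrices, the identity $HS = S^*H$ can only be expected on the top-left $(n-1) \times (n-1)$ block, where both sides equal $a_{i+j+1}$.

```python
def hankel(a, n):
    """n x n truncation of the Hankel matrix with entries a[i + j]."""
    return [[a[i + j] for j in range(n)] for i in range(n)]

def shift(n):
    """n x n truncation of the shift S e_j = e_{j+1}."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    size = len(y)
    return [[sum(x[i][k] * y[k][j] for k in range(size))
             for j in range(len(y[0]))] for i in range(len(x))]

n = 6
a = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5]  # arbitrary data, needs 2n - 1 entries
H = hankel(a, n)
S = shift(n)

HS = matmul(H, S)                   # column j of HS is column j + 1 of H
SstarH = matmul(transpose(S), H)    # row i of S*H is row i + 1 of H

# Away from the truncation boundary both products have entries a[i + j + 1].
agree = all(HS[i][j] == SstarH[i][j] == a[i + j + 1]
            for i in range(n - 1) for j in range(n - 1))
print(agree)
```

The last row of `SstarH` and the last column of `HS` are artifacts of the truncation and are deliberately excluded from the comparison.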

The shift operator on Hardy space $H^2(\mathbb{T})$ is given by multiplication by $z$, and Hankel operators on Hardy space enjoy the same intertwining with the shift operator.

While we have taken pains to outline these initial observations on the integers and the circle, there is an equivalent formulation on the real line. To be specific, on $L^2(\mathbb{R})$, we have the Fourier transform

$$\hat f(\xi) = \int_{\mathbb{R}} f(x)\, e^{-i \xi x}\, dx .$$

Define the orthogonal projections onto positive and negative frequencies

$$P_\pm f(x) \stackrel{\mathrm{def}}{=} \frac{1}{2\pi} \int_{\mathbb{R}_\pm} \hat f(\xi)\, e^{i \xi x}\, d\xi .$$

Define Hardy spaces $H^2_\pm(\mathbb{R}) \stackrel{\mathrm{def}}{=} P_\pm L^2(\mathbb{R})$. Functions $f \in H^2_+(\mathbb{R})$ admit an analytic extension to the upper half plane $\mathbb{C}_+$. As in the case of the disk, it is convenient to refer to functions in $H^2_+(\mathbb{R})$ as analytic.

A Hankel operator with symbol $b$ is then a linear operator from $H^2_+(\mathbb{R})$ to $H^2_-(\mathbb{R})$ given by $H_b \varphi \stackrel{\mathrm{def}}{=} P_- M_{\bar b}\, \varphi$. This only depends on the analytic part of $b$.

It will be convenient to consider some of our proofs in the setting of the real line. For, 
while it is equivalent to work on any of the three settings, the real line has a natural dilation 
structure which simplifies certain aspects of the argument. 

Acknowledgment. These notes were prepared while in residence at the University of British 
Columbia, for a conference "Harmonic Analysis at Sapporo, Japan" held in August 2005. 

2. Wavelets, BMO(R) and Paraproducts 

Ultimately, we are interested in characterizations of BMO. This class of functions has
delicate properties, sensitive to locations in both time and frequency. Wavelets turn out to
be very useful in analyzing their behavior. We recall some basic facts about two distinct 
classes of wavelets, namely the Haar and Meyer wavelets. 



Throughout this paper, $\mathcal{D}$ denotes the dyadic grid. Thus,

(2.1) $\mathcal{D} \stackrel{\mathrm{def}}{=} \{[j 2^k, (j+1) 2^k) : j,k \in \mathbb{Z}\}$
Define translation and dilation operators by

(2.2) $\mathrm{Tr}_y f(x) = f(x-y)$, $y \in \mathbb{R}$,

(2.3) $\mathrm{Dil}^{p}_{s} f(x) = s^{-1/p} f(x/s)$, $0 < s, p < \infty$,

(2.4) $\mathrm{Dil}^{p}_{I} f(x) = \mathrm{Tr}_{c(I)}\, \mathrm{Dil}^{p}_{|I|} f(x)$, $I$ is an interval.

In the second definition, $s$ denotes the scale of the dilation, and the normalization is chosen to preserve the $L^p(\mathbb{R})$ norm. In the last definition, we extend the definition of dilation to an interval, which incorporates a translation to the center of $I$, denoted $c(I)$, and a dilation by the scale of $I$.
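
The claim that the normalization $s^{-1/p}$ preserves the $L^p(\mathbb{R})$ norm can be checked numerically. The following Python sketch is an illustration under our own assumptions: the Gaussian test function, the values of `p`, `s` and the translation `2.5`, and the midpoint-rule quadrature in `lp_norm` are all choices made here, not taken from the notes.

```python
import math

def tr(f, y):
    """Translation: (Tr_y f)(x) = f(x - y)."""
    return lambda x: f(x - y)

def dil(f, s, p):
    """L^p-normalized dilation: (Dil^p_s f)(x) = s^(-1/p) f(x / s)."""
    return lambda x: s ** (-1.0 / p) * f(x / s)

def lp_norm(f, p, lo=-60.0, hi=60.0, n=120000):
    """Midpoint-rule approximation of the L^p norm of f on [lo, hi]."""
    dx = (hi - lo) / n
    total = sum(abs(f(lo + (k + 0.5) * dx)) ** p for k in range(n))
    return (total * dx) ** (1.0 / p)

gauss = lambda x: math.exp(-x * x)

p, s = 3.0, 4.0
base = lp_norm(gauss, p)
moved = lp_norm(tr(dil(gauss, s, p), 2.5), p)   # translate after dilating
print(abs(base - moved) < 1e-6)
```

For this test function the exact value of both norms is $(\pi/3)^{1/6}$, which gives an independent check on the quadrature.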

2.1. Haar Wavelets. The notations $h = h^0$ and $h^1$ are reserved for the functions

$$h = h^0 = -\mathbf{1}_{[0,1/2)} + \mathbf{1}_{[1/2,1]}, \qquad h^1 = \mathbf{1}_{[0,1]} .$$

Here, the superscript 0 means that the function has mean zero, while the superscript 1 means that the function does not have mean zero. We will use these two definitions in our discussion of paraproducts; the function $h^0$ is used most of the time, and we will frequently suppress the superscript when using these functions.

For any interval $I$, we can define $h^\epsilon_I \stackrel{\mathrm{def}}{=} \mathrm{Dil}^2_I h^\epsilon$. The Haar wavelets are then given by $\{h^0_I : I \in \mathcal{D}\}$. These functions are an orthogonal basis on $L^2(\mathbb{R})$, which extend to an unconditional basis on $L^p(\mathbb{R})$ for $1 < p < \infty$. In particular, the Littlewood Paley inequalities become



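2.5. Theorem. We have the estimates

$$\|f\|_p \simeq \Bigl\| \Bigl[ \sum_{I \in \mathcal{D}} \frac{|\langle f, h_I \rangle|^2}{|I|}\, \mathbf{1}_I \Bigr]^{1/2} \Bigr\|_p , \qquad 1 < p < \infty .$$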
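
At $p = 2$ the estimate above reduces to Parseval's identity for the Haar system. The Python sketch below checks orthonormality and Parseval on a discretization of $[0,1)$. It is only an illustration: we restrict to dyadic intervals inside $[0,1)$, we sample on a grid of `N` cells, and we take $h_I$ to be supported on $I$ with value $-|I|^{-1/2}$ on the left half and $+|I|^{-1/2}$ on the right half. The test vector `f` is arbitrary.

```python
import math

N_LEVELS = 4
N = 2 ** N_LEVELS            # number of grid cells on [0, 1)

def haar(k, j):
    """Samples of h_I for I = [j 2^-k, (j+1) 2^-k), L^2 normalized."""
    width = N >> k            # number of cells in I
    amp = math.sqrt(2 ** k)   # |I|^(-1/2)
    v = [0.0] * N
    start = j * width
    for c in range(width // 2):
        v[start + c] = -amp                  # left half of I
        v[start + width // 2 + c] = amp      # right half of I
    return v

def inner(u, v):
    """Inner product on [0, 1) for functions constant on grid cells."""
    return sum(x * y for x, y in zip(u, v)) / N

intervals = [(k, j) for k in range(N_LEVELS) for j in range(2 ** k)]
basis = [haar(k, j) for k, j in intervals]

# Largest deviation of the Gram matrix from the identity.
max_err = max(abs(inner(basis[a], basis[b]) - (1.0 if a == b else 0.0))
              for a in range(len(basis)) for b in range(len(basis)))

# Parseval for a mean-zero grid function.
f = [math.sin(2 * math.pi * (c + 0.5) / N) + 0.3 * (c % 3) for c in range(N)]
mean = sum(f) / N
f0 = [x - mean for x in f]
coeff_energy = sum(inner(f0, h) ** 2 for h in basis)
parseval_gap = abs(coeff_energy - inner(f0, f0))
print(max_err < 1e-12, parseval_gap < 1e-12)
```

The mean is removed because the Haar functions inside $[0,1)$ span only the mean-zero grid functions; the constant function is the one missing direction.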



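More generally, if $\varphi$ is adapted to $[0,1]$ and has mean zero, the square function below maps $L^p$ into itself for all $1 < p < \infty$. Here $\varphi_I = \mathrm{Dil}^2_I \varphi$.

(2.6) $$S f \stackrel{\mathrm{def}}{=} \Bigl[ \sum_{I \in \mathcal{D}} \frac{|\langle f, \varphi_I \rangle|^2}{|I|}\, \mathbf{1}_I \Bigr]^{1/2} .$$

One natural extension of these estimates to the cases of $p = 1$ and $p = \infty$ can be taken as the definitions of Hardy space $H^1$ and BMO.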

We should also mention that

(2.7) $$M^{\mathrm{dy}} f(x) = \sup_{I \in \mathcal{D}} \frac{|\langle f, h^1_I \rangle|}{\sqrt{|I|}}\, \mathbf{1}_I(x)$$

is the dyadic maximal function. It maps $L^p$ into itself for all $1 < p < \infty$. The proof of this theorem can appeal to probabilistic methods. Indeed, the expansion of a function in a Haar basis is a martingale. This fact lies behind the very successful application of Haar functions to establish a range of deeper properties of singular integrals, including the Hilbert transform. These properties include the UMD theory of Burkholder [3] and Bourgain [2]; matrix valued paraproducts, as discussed in articles by a range of authors [46,47,51,58,59]; and the Nazarov Treil Volberg extension of the Calderón Zygmund theory [49,50], as well as their discussion of the Bellman function approach [48,52].

More generally, the definition of the maximal function is

(2.8) $$M f(x) = \sup_{t > 0}\, (2t)^{-1} \Bigl| \int_{-t}^{t} f(x-y)\, dy \Bigr|$$

where it is essential that we do not impose absolute values inside the integral. Define the Hilbert transform

(2.9) $$\mathrm{H} f(x) \stackrel{\mathrm{def}}{=} -P_- f(x) + P_+ f(x) = \mathrm{p.v.}\, \frac{i}{\pi} \int f(x-y)\, \frac{dy}{y} .$$

2.2. Meyer Wavelet. Y. Meyer [42] found a Schwartz function $w$, with

(2.10) $\hat w$ is supported on $2\pi/3 \le |\xi| \le 8\pi/3$,

and the functions $\{w_I : I \in \mathcal{D}\}$ form an orthonormal basis for $L^2(\mathbb{R})$. Here, we use the same notation as in the case of the Haar basis, $w_I = \mathrm{Dil}^2_I w$.

As with the Haar basis, these extend to an unconditional basis on $L^p(\mathbb{R})$, for $1 < p < \infty$. As concerns the Hilbert transform, observe that

$$\mathrm{H} f = \sum_{I \in \mathcal{D}} \langle f, w_I \rangle\, \mathrm{H} w_I ,$$

and that the functions $\mathrm{H} w_I = \mathrm{Dil}^2_I(\mathrm{H} w)$ have the same decay and Fourier localization properties as the Meyer wavelet.

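Similarly, if $f \in H^2(\mathbb{R})$, we have

$$f = P_+ f = \sum_{I \in \mathcal{D}} \langle f, w_I \rangle\, w_I = P_+ \Bigl[ \sum_{I \in \mathcal{D}} \langle P_+ f, w_I \rangle\, w_I \Bigr] = \sum_{I \in \mathcal{D}} \langle f, w_I \rangle\, P_+ w_I .$$

Therefore, $\{P_+ w_I : I \in \mathcal{D}\}$ is a basis for $H^2(\mathbb{R})$.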
2.3. Hardy Space $H^1(\mathbb{T})$ and BMO. For $1 \le p < \infty$, the Hardy space $H^p(\mathbb{T})$ is the closure in $L^p(\mathbb{T})$ norm of the span of the exponentials $\{z^n : n \in \mathbb{N}\}$. A function $f \in H^p(\mathbb{T})$ has an analytic extension $F$ to the disk. Indeed, this extension is

$$F(z) = \sum_{n=0}^{\infty} \hat f(n)\, z^n ,$$

and the $H^p(\mathbb{T})$ norm can be taken to be

$$\|f\|_{H^p(\mathbb{T})} \stackrel{\mathrm{def}}{=} \sup_{0 < r < 1} \|F(rz)\|_{L^p(\mathbb{T})} .$$

Concerning the space $H^1(\mathbb{T})$, the following classical property is central to us.

2.11. Proposition. Each function $f \in H^1(\mathbb{T})$ is a product of functions $f_1, f_2 \in H^2(\mathbb{T})$; in particular, $f_1$ and $f_2$ can be chosen so that

$$\|f\|_{H^1} = \|f_1\|_{H^2}\, \|f_2\|_{H^2} .$$

2.12. Remark. In the product setting to which we turn next, this last property fails. Indeed, part of the interest of our results is that while simple factorization will fail, a notion of weak factorization is in fact true.

2.13. Remark. The investigation of the failure of factorization is an intricate one. In the product setting, Rudin [64] proved the failure of factorization in the case of $\mathbb{D}^d$ for $d \ge 4$; Miles [43] improved the result to $d \ge 3$; Rosay [63] in the case of $d = 2$ showed that the set of functions in $H^1(\mathbb{D} \otimes \mathbb{D})$ which factor is of the first category. This problem is also of interest in the Bergman space setting. See [29,31,32]. See [29] for information about this question in other spaces of analytic functions.

We are especially interested in the Hardy space $H^1(\mathbb{R})$. It is technically easier to discuss the real Hardy space $\mathrm{Re}(H^1)$ consisting of the real parts of functions in $H^1$.

2.14. Theorem. We have the equivalence of norms

$$\|f\|_{\mathrm{Re}\, H^1} \simeq \|f\|_1 + \|\mathrm{H} f\|_1 \simeq \|S f\|_1 \simeq \|M f\|_1 .$$

Here S is as in (2.6). 

Any standard reference in the subject will include a proof of this result. Historically, this 
kind of characterization was an essential precursor to the proof of Fefferman and Stein of H 1 
and BMO duality. 

Indeed, one can use this to obtain a characterization of $H^1$ in terms of atoms (which we do not define here), which leads immediately to
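2.15. Theorem ($H^1(\mathbb{R})$-$\mathrm{BMO}(\mathbb{R})$ duality). The dual of $\mathrm{Re}(H^1(\mathbb{R}))$ is $\mathrm{BMO}(\mathbb{R})$ with norm

$$\|f\|_{\mathrm{BMO}(\mathbb{R})} \stackrel{\mathrm{def}}{=} \sup_{J \text{ is an interval}} \Bigl[ |J|^{-1} \sum_{\substack{I \in \mathcal{D} \\ I \subset J}} |\langle f, w_I \rangle|^2 \Bigr]^{1/2} \simeq \sup_{J \text{ is an interval}} \Bigl[ |J|^{-1} \int_J |f - \mu_J|^2\, dx \Bigr]^{1/2}$$

where $\mu_J = |J|^{-1} \int_J f(y)\, dy$.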



The second definition has the advantage of being intrinsic to /, but has the disadvantage 
of not having a suitable generalization to higher parameters. Thus, we have stressed the first 
definition in terms of wavelets. There is nothing special about the Meyer wavelet appearing 
here. It can be replaced by any wavelet, including the Haar wavelet in this context. 

One nice feature of the Meyer wavelet is that if we replace the functions $w_I$ by their analytic projections, we obtain a completely analogous definition of analytic BMO, the dual to $H^1(\mathbb{T})$.



2.4. Paraproducts. Paraproduct is the term used to refer to any of a wide variety of objects 
that are a variant of a product of two functions. A Hankel operator is one such example, but 
our purpose in this section is to describe a class of more naive examples. 

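Consider the operation $M_b f$, where we take both $b$ and $f$ to have finite Haar expansion on the real line.

$$M_b f = \sum_{I, J \in \mathcal{D}} \langle b, h_I \rangle \langle f, h_J \rangle\, h_I \cdot h_J .$$

Restrict the sum above to $I \subsetneq J$, and observe that

(2.16) $$\mathrm{Para}_{\mathrm{Haar}}(b, f) \stackrel{\mathrm{def}}{=} \sum_{\substack{I, J \in \mathcal{D} \\ I \subsetneq J}} \langle b, h_I \rangle \langle f, h_J \rangle\, h_I \cdot h_J = \sum_{I \in \mathcal{D}} \frac{\langle b, h_I \rangle}{\sqrt{|I|}}\, \langle f, h^1_I \rangle\, h_I .$$

Here, we are appealing to these properties of Haar functions:

$$\langle h_J, h^1_I \rangle = 0 , \quad J \subset I , \qquad\qquad h_I \cdot h_J = \frac{\langle h_J, h^1_I \rangle}{\sqrt{|I|}}\, h_I , \quad I \subsetneq J .$$

The point of this is that while the operator norm of $M_b$ is $\|b\|_\infty$, the norm of the operator in (2.16) is in general somewhat smaller. $\mathrm{Para}_{\mathrm{Haar}}(b, \cdot)$ is a paraproduct.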
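
The passage from the double sum over $I \subsetneq J$ to the single sum in (2.16) can be verified on a small grid. The Python sketch below is an illustration under our own assumptions: we work on $[0,1)$ with dyadic intervals of length at least $2/N$, we take $h_I$ supported on $I$ with value $-|I|^{-1/2}$ on the left half and $+|I|^{-1/2}$ on the right half, $h^1_I = |I|^{-1/2} \mathbf{1}_I$, and the Haar coefficients of `b` and `f` are arbitrary test data.

```python
import math

LEVELS = 4
N = 2 ** LEVELS                       # grid cells on [0, 1)
D = [(k, j) for k in range(LEVELS) for j in range(2 ** k)]  # I = [j 2^-k, (j+1) 2^-k)

def haar(k, j):
    """L^2-normalized Haar function h_I sampled on the grid."""
    width, amp = N >> k, math.sqrt(2 ** k)
    v = [0.0] * N
    for c in range(j * width, (j + 1) * width):
        v[c] = -amp if c - j * width < width // 2 else amp
    return v

def indicator(k, j):
    """h^1_I = |I|^(-1/2) 1_I sampled on the grid."""
    width, amp = N >> k, math.sqrt(2 ** k)
    return [amp if j * width <= c < (j + 1) * width else 0.0 for c in range(N)]

def inner(u, v):
    return sum(x * y for x, y in zip(u, v)) / N

def strictly_inside(small, big):
    (k, j), (m, i) = small, big
    return k > m and (j >> (k - m)) == i

def expand(coeffs):
    """Function with prescribed Haar coefficients."""
    out = [0.0] * N
    for I, c in zip(D, coeffs):
        out = [o + c * x for o, x in zip(out, haar(*I))]
    return out

b = expand([((7 * n) % 5) - 2.0 for n in range(len(D))])
f = expand([((3 * n) % 7) - 3.0 for n in range(len(D))])

# Left side of (2.16): double sum over I strictly inside J.
para_double = [0.0] * N
for I in D:
    bI, hI = inner(b, haar(*I)), haar(*I)
    for J in D:
        if strictly_inside(I, J):
            fJ, hJ = inner(f, haar(*J)), haar(*J)
            para_double = [p + bI * fJ * x * y for p, x, y in zip(para_double, hI, hJ)]

# Right side of (2.16): single sum with the non-mean-zero function h^1_I on f.
para_single = [0.0] * N
for (k, j) in D:
    size = 2.0 ** (-k)                # |I|
    bI, fI1 = inner(b, haar(k, j)), inner(f, indicator(k, j))
    para_single = [p + bI / math.sqrt(size) * fI1 * x
                   for p, x in zip(para_single, haar(k, j))]

gap = max(abs(x - y) for x, y in zip(para_double, para_single))
print(gap < 1e-10)
```

The agreement relies on `f` lying in the span of the Haar functions inside $[0,1)$, so that the sum of $\langle f, h_J \rangle h_J$ over $J \supsetneq I$ is exactly the average of `f` on $I$.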
Paraproducts admit a more general definition, which is the point of the following definition. For an interval $I$, we say that $\varphi$ is adapted to $I$ iff $\|\varphi\|_2 = 1$ and

(2.17) $$|D^n \varphi(x)| \lesssim |I|^{-n - \frac12} \Bigl( 1 + \frac{|x - c(I)|}{|I|} \Bigr)^{-N} , \qquad n = 0, 1 .$$

Here, $c(I)$ denotes the center of $I$, and $N$ is a large integer, whose exact value need not concern us. $D$ denotes the derivative operator. We shall consistently work with functions which have $L^2$ norm at most one. Some of these functions we will also insist to have integral zero.

We take $\{\varphi^\epsilon_I : I \in \mathcal{D}\}$, $\epsilon \in \{0, 1\}$, to be functions adapted to $I \in \mathcal{D}$. The functions $\varphi^0_I$ are of mean zero. Then, set

$$\mathrm{Para}(b, f) \stackrel{\mathrm{def}}{=} \sum_{I \in \mathcal{D}} \frac{\langle b, \varphi^0_I \rangle}{\sqrt{|I|}}\, \langle f, \varphi^1_I \rangle\, \varphi^0_I .$$

2.18. Theorem. For $1 < p < \infty$, we have the inequality

(2.19) $$\|\mathrm{Para}_{\mathrm{Haar}}(b, \cdot)\|_{p \to p} \lesssim \|b\|_{\mathrm{BMO}^{\mathrm{dy}}} ,$$

where the last norm is the dyadic BMO norm defined by

(2.20) $$\|b\|_{\mathrm{BMO}^{\mathrm{dy}}} = \sup_{J \in \mathcal{D}} \Bigl[ |J|^{-1} \sum_{I \subset J} |\langle b, h_I \rangle|^2 \Bigr]^{1/2} .$$

For the operators $\mathrm{Para}$, we have

$$\|\mathrm{Para}(b, \cdot)\|_{p \to p} \lesssim \|b\|_{\mathrm{BMO}} .$$

It is essential in this formulation that we do not let the function $h^1_I$, which does not have mean zero, fall on the BMO function $b$. This point of view is very helpful in obtaining upper bounds for other commutators, Hankel operators and related objects.



It is useful to observe that paraproducts can arise in a wide variety of forms; in particular, the classical approach of Coifman and Meyer relies upon the '$P_t$-$Q_t$' formalism. A form useful to us is as follows. For the Meyer wavelet $w$, with its antianalytic and analytic parts, respectively $u$ and $v$, let us set

(2.21) $$\Delta U_j f \stackrel{\mathrm{def}}{=} \sum_{\substack{I \in \mathcal{D} \\ |I| = 2^j}} \langle f, u_I \rangle\, u_I ,$$

(2.22) $$U_j \stackrel{\mathrm{def}}{=} \sum_{k \ge j} \Delta U_k .$$

The following Theorem concerns a paraproduct which is quite close to being a Hankel operator; its bounds are a corollary to the previous Theorem.
2.23. Theorem. We have the estimate

(2.24) $$\Bigl\| \sum_{j \in \mathbb{Z}} (\Delta U_j b) \cdot (U_j \varphi) \Bigr\|_2 \lesssim \|b\|_{\mathrm{BMO}(\mathbb{R})}\, \|\varphi\|_2 .$$



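Proof. Let us first consider terms like

$$\sum_{j} \Delta U_j b \cdot \Delta U_{j+k} \varphi , \qquad 0 \le k \le 8 .$$

We take e.g. $|k| \le 8$ due to the compact frequency support of the Meyer wavelets, as will become clear momentarily. The assertion is that each of these is a bounded operator, provided $b \in \mathrm{BMO}(\mathbb{R})$.

We control this expression in a brute force method, which we will appeal to twice. Fix $k$ and a map $\pi : \mathcal{D} \to \mathcal{D}$ so that $|\pi(I)| = 2^k |I|$ and $(A-1)|I| \le \mathrm{dist}(I, \pi(I)) < A |I|$, for some fixed integer $A$.

Write

$$\psi_I \stackrel{\mathrm{def}}{=} \sqrt{|I|}\; u_I \cdot u_{\pi(I)} .$$

Observe that $A^{100} \psi_I$ is adapted to $I$ with constant independent of $A$ (and we certainly do not assert that it has integral zero!). This is the only observation we need to make to conclude that

$$\Bigl\| \sum_{I \in \mathcal{D}} \frac{\langle b, u_I \rangle}{\sqrt{|I|}}\, \langle \varphi, u_{\pi(I)} \rangle\, \psi_I \Bigr\|_2 \lesssim A^{-100}\, \|b\|_{\mathrm{BMO}(\mathbb{R})}\, \|\varphi\|_2 .$$

This is summed over the different values of $A$ and $\pi$ to conclude the estimate

$$\Bigl\| \sum_{j} \Delta U_j b \cdot \Delta U_{j+k} \varphi \Bigr\|_2 \lesssim \|b\|_{\mathrm{BMO}(\mathbb{R})}\, \|\varphi\|_2 , \qquad 0 \le k \le 8 .$$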



Associated with the Meyer wavelet is a 'father wavelet,' a function $W$ of non zero mean, for which $w = W - \mathrm{Dil}^{1}_{2} W$. Using this, we see that

$$U_j f = \sum_{|I| = 2^j} \langle f, W_I \rangle\, W_I .$$

We then have

$$\sum_{j} (\Delta U_j b) \cdot U_{j+9} \varphi = \sum_{2^9 |I| = |J|} \frac{\langle b, u_I \rangle}{\sqrt{|I|}}\, \langle P_+ \varphi, W_J \rangle\, \sqrt{|I|}\; u_I \cdot W_J .$$

Observe that in contrast to the previous case, we are taking inner products $\langle P_+ \varphi, W_J \rangle$ and $W_J$ will have a non zero mean.

Nevertheless, this last sum can be controlled by the brute force method used above, with this observation. Take two dyadic intervals $I$ and $J$ with $2^9 |I| = |J|$ and $(A-1)|I| \le \mathrm{dist}(I, J) < A |I|$, for integer $A$. Then,

$$A^{100} \sqrt{|I|}\; u_I \cdot W_J$$

is adapted to $I$ with constant independent of $A$, and has integral zero. The reason for this stems from (2.10). The Fourier transform of the product is supported in the sum of the supports of the Fourier transforms of the two functions. Since $I$ has the smaller scale, the Fourier transform of $u_I$ is supported a very great distance from the origin, hence the Fourier transform of the product is zero at the origin. This completes the proof. □



3. The Nehari Theorem on the Disk 



The classical result that we are interested in is: 

Nehari Theorem ([53]). The Hankel operator $H_b$ is bounded from $H^2_+(\mathbb{T})$ to $H^2_-(\mathbb{T})$ iff there is a bounded function $\beta$ with $P_+ b = P_+ \beta$. Moreover,

$$\|H_b\| = \inf_{\beta} \|\beta\|_\infty .$$

There are three proofs of this fact in the literature. In the new results, we will need to rely upon methods from two of these proofs.

Factorization. Given a bounded Hankel operator $H_b$, we want to show that we can construct a bounded function $\beta$ so that the analytic parts of $b$ and $\beta$ agree.

This proof is the one found by Nehari [53]. We begin with a basic computation of the norm of the Hankel operator $H_b$:

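(3.1) $$\begin{aligned} \|H_b\| &= \sup_{\|\psi\|_{H^2(\mathbb{T})} = 1}\ \sup_{\|\varphi\|_{H^2(\mathbb{T})} = 1} \int H_b \psi \cdot \overline{\varphi}\, dx \\ &= \sup\ \sup \int P_+ M_b \psi \cdot \varphi\, dx \\ &= \sup\ \sup \int (P_+ b)\, \psi \cdot \varphi\, dx \\ &= \sup_{\|\psi\|_{H^2_+(\mathbb{T})} = 1}\ \sup_{\|\varphi\|_{H^2_+(\mathbb{T})} = 1} \langle P_+ b,\ \psi \cdot \varphi \rangle . \end{aligned}$$

But $H^1(\mathbb{T}) = H^2(\mathbb{T}) \cdot H^2(\mathbb{T})$, as we recalled in Proposition 2.11. We read from the equality above that the analytic part of $b$ defines a bounded linear functional on $H^1(\mathbb{T})$, a subspace of $L^1(\mathbb{T})$.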
The Hahn Banach Theorem applies, giving us an extension of this linear functional to all of $L^1$, with the same norm. But a bounded linear functional on $L^1$ is given by a bounded function, hence we have constructed a bounded function $\beta$ with the same analytic part as $b$.

Duality. In this proof, the $H^1$-BMO duality is decisive. The calculation (3.1) shows that $P_+ b$ is a bounded linear functional on $H^1$. Therefore, we have

$$\|H_b\| \simeq \|P_+ b\|_{\mathrm{BMO}} .$$

(This is not equality, since we are not choosing to adopt a canonical norm for BMO.) In addition, we have $\mathrm{BMO} = L^\infty + \mathrm{H} L^\infty$, where $\mathrm{H}$ is the Hilbert transform. Therefore, we can select $\beta \in L^\infty$ which has the same analytic part as $b$.

3.2. Remark. Historically, this proof came last; it depends critically upon the Fefferman Stein $H^1$-BMO duality, which was not established until the 1970's, see [22].

The AAK Method. Adamjan, Arov and Krein [1] invented a method based upon a dilation¹ property of operators on Hilbert space. This method avoids the finer aspects of Hardy spaces.

We are given a Hankel matrix which is bounded from $\ell^2(\mathbb{N})$ to itself, and we seek to extend it to a bounded matrix on $\ell^2(\mathbb{Z})$, with the same norm. This is an inductive process, with the first step being that we seek to add, say, a row to the 'top' of the matrix:

$$\begin{pmatrix} * & a_0 & a_1 & \cdots \\ a_0 & a_1 & a_2 & \cdots \\ a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix} , \qquad \begin{pmatrix} a_0 & a_1 & a_2 & \cdots \\ a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix} .$$

Namely, we seek to choose a value of $*$ to put on the upper left hand coordinate so that the two matrices have the same norm.

This in fact can be done, and leads to the following Proposition.

3.3. Proposition. Consider two Hilbert spaces $\mathcal{G}$ and $\mathcal{H}$, and consider linear operators from $\mathcal{G} \oplus \mathcal{H}$ into itself of the form

$$U = \begin{bmatrix} X & C \\ A & B \end{bmatrix}$$

where $X : \mathcal{G} \to \mathcal{G}$; $A : \mathcal{G} \to \mathcal{H}$; $B : \mathcal{H} \to \mathcal{H}$; and $C : \mathcal{H} \to \mathcal{G}$. We presume that $A$, $B$, $C$ are prescribed in advance. Then we can select $X$ so that

$$\|U\| = \max\Bigl\{ \bigl\| \begin{bmatrix} A & B \end{bmatrix} \bigr\| ,\ \Bigl\| \begin{bmatrix} C \\ B \end{bmatrix} \Bigr\| \Bigr\} .$$

¹ This dilation property is distinct from the dilation property of the real line used in other parts of this paper.
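Given a Hankel matrix $H = \{a_{j+k} : j, k \in \mathbb{N}\}$, we apply the proposition above with

$$A = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \end{bmatrix} , \qquad B = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & \cdots & \\ \vdots & & & \end{bmatrix} , \qquad C = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots \end{bmatrix} .$$

By the proposition, we can choose $a_{-1}$ so that the norm of $H$ is the norm of

$$\begin{bmatrix} a_{-1} & a_0 & a_1 & \cdots \\ a_0 & a_1 & a_2 & \cdots \\ a_1 & a_2 & a_3 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} .$$

By induction, we can extend the Hankel operator to one from $\ell^2(\mathbb{Z})$ to itself, as a bounded operator with the same norm. The conclusion of Nehari's theorem then follows.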

3.4. Remark. This method has found many deep extensions to Hankel matrices whose entries 
are themselves operators. We refer the reader to Nikolski [54], as well as Nikolski [55] and 
Peller [57] for very interesting discussions of the method of Adamyan, Arov and Krein. This 
is relevant for us, as in the extension to the product setting, we consider Hankel matrices 
whose entries are also Hankelian. Cotlar and Sadosky have studied extensions of this method 
to the polydisk in a sequence of papers [13-16, 18, 19].

3.1. Commutators. Commutators are very useful in measuring, in a quantitative way, the 
distance from being abelian. They are relevant for us as the Nehari theorem has an equivalent 
formulation in terms of commutators of multiplication operators and the Hilbert transform. 

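3.5. Theorem. We have the equivalence

$$\|[M_b, \mathrm{H}]\|_{2 \to 2} \simeq \|b\|_{\mathrm{BMO}} .$$

Here, $\mathrm{BMO} = (\mathrm{Re}\, H^1(\mathbb{T}))^*$ is real BMO.

The proof in this circumstance is immediate: Observe that $P_\mp [M_b, \mathrm{H}] P_\pm$ is itself a Hankel operator, or a conjugate of a Hankel operator. Specifically, with $\mathrm{H} = P_+ - P_-$ as in (2.9),

$$P_+ [M_b, \mathrm{H}] P_+ = 0 , \qquad P_- [M_b, \mathrm{H}] P_- = 0 ,$$

$$P_+ [M_b, \mathrm{H}] P_- = -2\, P_+ M_b P_- , \qquad P_- [M_b, \mathrm{H}] P_+ = 2\, P_- M_b P_+ ,$$

and the last two operators are orthogonal, and Hankel operators.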
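
The four block identities are purely algebraic: they use only that $P_+$ and $P_-$ are complementary projections and that $\mathrm{H} = P_+ - P_-$. The Python sketch below checks them in a finite dimensional model. This is an illustration under our own assumptions: `P_plus` and `P_minus` are coordinate projections on a six dimensional space, and `M` is a random integer matrix standing in for the multiplication operator. With this normalization of $\mathrm{H}$ the off-diagonal blocks come out as $\mp 2$ times the corresponding blocks of `M`; a different normalization of $\mathrm{H}$ changes only these constants.

```python
import random

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(x, y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]

def scale(c, x):
    return [[c * a for a in r] for r in x]

def diag(d):
    return [[d[i] if i == j else 0 for j in range(len(d))] for i in range(len(d))]

def block(P, X, Q):
    """The compression P X Q."""
    return matmul(matmul(P, X), Q)

random.seed(0)
n = 6
P_plus = diag([1, 1, 1, 0, 0, 0])     # stand-in for P_+
P_minus = diag([0, 0, 0, 1, 1, 1])    # stand-in for P_-
H = sub(P_plus, P_minus)              # H = P_+ - P_-
M = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]

comm = sub(matmul(M, H), matmul(H, M))   # [M, H] = M H - H M
zero = diag([0] * n)

diagonal_blocks_vanish = (block(P_plus, comm, P_plus) == zero
                          and block(P_minus, comm, P_minus) == zero)
upper_ok = block(P_plus, comm, P_minus) == scale(-2, block(P_plus, M, P_minus))
lower_ok = block(P_minus, comm, P_plus) == scale(2, block(P_minus, M, P_plus))
print(diagonal_blocks_vanish, upper_ok, lower_ok)
```

All arithmetic is on integers, so the comparisons are exact.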

The Theorem admits many extensions. For instance, we continue to have the equivalence

$$\|[M_b, \mathrm{H}]\|_{p \to p} \simeq \|b\|_{\mathrm{BMO}} , \qquad 1 < p < \infty .$$

Indeed, assuming the commutator with symbol $b$ is bounded on $L^p$, the same is true on the dual index $L^{p'}$. But then, by interpolation, the commutator is bounded on $L^2$, and we can appeal to the Theorem above to deduce that the symbol is in BMO.

The upper bound on the $L^p$ norm of the commutator follows by considering the Hankel operator $P_- M_b P_+$. Using the calculation in (3.1), we can see that

$$\|P_- M_b P_+\|_{p \to p} = \sup_{\varphi \in H^p}\ \sup_{\psi \in H^{p'}} \int (P_- b) \cdot \varphi \psi\, dx .$$

But $H^1 = H^p \cdot H^{p'}$, so this last quantity is $\|P_- b\|_{\mathrm{BMO}}$.

One direction in which this result extends is for the commutator to characterize a broad array of function spaces. The genesis of this theme is the very interesting article of Coifman, Rochberg and Weiss [12], which considers the instance of commutators of $M_b$ and Riesz transforms.

Subsequently, it turns out that one has the equivalence

$$\|[M_b, \mathrm{H}]\|_{X \to Y} \simeq \|b\|_Z$$

for a range of spaces $X$, $Y$, and function spaces $Z$. Various Lipschitz classes can be characterized this way; further generalizations can be stated in terms of various Besov and Triebel-Lizorkin spaces. Moreover, the Hilbert transform can be replaced by other operators, such as the fractional integral operators. There is a significant literature here, of which we cite Chanillo [11], Cruz-Uribe and Fiorenza [21], as well as the references in the article of the author [40] which is a first step in extending some of these results to higher parameter settings.

Another direction is to abandon the chance of characterizing function spaces, replacing the Hilbert transform by, say, a Calderón Zygmund operator $T$. The method of choice in such generalizations is the use of the sharp function:

$$([M_b, T] f)^\sharp \lesssim M f ,$$

where on the right we intend $M$ to be an appropriate maximal function. This method was (to the best of my knowledge) first used on this problem by Coifman, Rochberg and Weiss [12], and since then has been used by a wide variety of authors.

This method has difficulties in being generalized to the product setting; we ask the reader's patience for not defining precisely what the sharp function is.²



² A paper of R. Fefferman [23] indicates a certain extension of the sharp function to the two parameter product setting. The difficulty of a similar extension to a three parameter setting centers around the tenuous relationship between rectangular BMO and BMO in three and higher parameters. See however [66].

There is an alternate method, which is of interest as it highlights the role of paraproducts in these questions, and permits a generalization to the higher parameter setting. See the author's paper [40].

3.2. Paraproducts and Commutators. We describe how to write the commutator $[M_b, \mathrm{H}]$ as a sum of two paraproducts. From this, appropriate upper bounds on the norms of the commutator can be given. In order to keep the exposition as simple as possible, we will rely on particular properties of the Hilbert transform. Yet, the method is flexible, and can apply to a wide variety of operators.

We return to the Haar basis on the real line. For convenience, set

$$g_I = -h_{I_{\mathrm{left}}} + h_{I_{\mathrm{right}}} , \qquad I \in \mathcal{D} .$$

Here, we are using obvious notation to refer to the left and right halves of the dyadic interval $I$, which again are dyadic. And define an operator by $G f = \sum_{I \in \mathcal{D}} \langle f, h_I \rangle\, g_I$.

Now, it is well known that the Hilbert transform is nearly diagonalized in a wavelet basis. Yet, since the Hilbert transform has odd kernel, it is not appropriate to decompose with an 'even' choice of basis. Clearly, $g_I$ is an 'odd' version of the Haar function $h_I$, so $G$ is a bit like the Hilbert transform. Of course it lacks translation and dilation invariance. It is a very nice observation of S. Petermichl [58] (also see [59]) that these are the only properties missing. Namely, we have

3.6. Proposition (Petermichl [58]). The operator $A$ below is a non zero multiple of the Hilbert transform.

(3.7)

Proof. Observe that the limit defining $A$ in (3.7) exists for Schwartz functions $\varphi$. Moreover, as $G$ is a bounded operator on $L^2$, we conclude that $A$ is also bounded on $L^2$.

Due to the limiting procedure, one sees that $A$ is a translation invariant operator. The average over dilations is taken with respect to Haar measure for the dilation group, hence $A$ is also invariant with respect to dilations. It is a classical fact that a bounded linear operator, invariant with respect to translations and dilations, is a linear combination of the Identity operator and the Hilbert transform.

Clearly, $A 1 = 0$. That is, $A$ is a multiple of the Hilbert transform. And so we should check that it is not the zero operator. But one can check directly that $A$ applied to the Dirac measure at the origin is

$$A \delta_0 (x) \simeq x^{-1} .$$

Hence it is a non zero multiple of the Hilbert transform. □

3.8. Proposition. The commutator $[M_b, G]$ can be written as a linear combination of paraproducts, or paraproducts composed with $G$. In particular, the commutator is bounded on $L^2$, when $b \in \mathrm{BMO}$.

Proof. We use the notation $\psi \otimes \varphi$ to denote the rank one linear operator

$$\psi \otimes \varphi (f) = \psi\, \langle \varphi, f \rangle .$$

We will expand the symbol $b$ in the Haar basis. $G$ is an explicit sum over rank one operators as above, and we will make a computation of commutators for Haar functions. As such, it is convenient to split the operator $G$ into $G_{\mathrm{left}}$ and $G_{\mathrm{right}}$, where we define

$$G_{\mathrm{left}} = \sum_{I} h_{I_{\mathrm{left}}} \otimes h_I ,$$

with a similar definition for $G_{\mathrm{right}}$. Below we will only consider the 'left' version.

We are led to expand the commutators

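$$[M_{h_I},\ h_{J_{\mathrm{left}}} \otimes h_J] = (h_I h_{J_{\mathrm{left}}}) \otimes h_J - h_{J_{\mathrm{left}}} \otimes (h_I h_J)$$

(3.9) $$= |J|^{-1/2} \times \begin{cases} 0 & I \cap J = \emptyset \ \text{ or } \ J \subsetneq I , \\ -h_{J_{\mathrm{left}}} \otimes h_J - h_{J_{\mathrm{left}}} \otimes h^1_J & I = J , \\ \sqrt{2}\, h^1_{J_{\mathrm{left}}} \otimes h_J + h_{J_{\mathrm{left}}} \otimes h_{J_{\mathrm{left}}} & I = J_{\mathrm{left}} , \\ -h_{J_{\mathrm{left}}} \otimes h_{J_{\mathrm{right}}} & I = J_{\mathrm{right}} , \\ \pm\sqrt{2}\, h_I \otimes h_J + h_{J_{\mathrm{left}}} \otimes h_I & I \subsetneq J_{\mathrm{left}} , \\ -h_{J_{\mathrm{left}}} \otimes h_I & I \subsetneq J_{\mathrm{right}} . \end{cases}$$

In the case $I \subsetneq J_{\mathrm{left}}$, the sign $\pm$ is the sign of $h_{J_{\mathrm{left}}}$ on $I$.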

In this computation, we note that there are two conditions that lead to the commutator being zero. The first is a trivial localization condition, $I \cap J = \emptyset$. The second condition, $J \subsetneq I$, is an essential cancellation condition coming from the commutator.

All other terms lead to a paraproduct term, although some of these paraproducts are trivial, in that all relevant functions have a zero. Apply this computation to the commutator in question, expanding as

$$[M_b, G_{\mathrm{left}}] = \sum_{I, J \in \mathcal{D}} \langle b, h_I \rangle\, [M_{h_I},\ h_{J_{\mathrm{left}}} \otimes h_J] .$$

For instance, from the case of $I = J$, we get

$$- \sum_{J} \frac{\langle b, h_J \rangle}{\sqrt{|J|}}\, h_{J_{\mathrm{left}}} \otimes h^1_J ,$$

which is a paraproduct operator with symbol $b$. Notice that the '1' falls on the right side of the tensor product of Haar functions. In considering the case $I = J_{\mathrm{left}}$, we get a paraproduct that is dual to the one above, namely

$$\sum_{J} \frac{\langle b, h_{J_{\mathrm{left}}} \rangle}{\sqrt{|J_{\mathrm{left}}|}}\, h^1_{J_{\mathrm{left}}} \otimes h_J .$$

The other term that arises from the case $I = J_{\mathrm{left}}$ is less singular:

$$\sum_{J} \frac{\langle b, h_{J_{\mathrm{left}}} \rangle}{\sqrt{|J|}}\, h_{J_{\mathrm{left}}} \otimes h_{J_{\mathrm{left}}} .$$

Here, all functions are Haar functions, that is they have zeros. This case is easier to control.
Let us consider the case of $I \subsetneq J_{\mathrm{left}}$. Observe that we have



G icft h) — - ^2 V |7i h 



Keeping this in mind, we see that 



cj lRft vl J l / Vl J l 



That is, we have a composition of G* and a paraproduct. For the other term associated with 
this case we have 

This is dual to the previous case. Our proof is complete. 

□ 

3.10. Remark. A simpler exposition of this approach can be had for commutators of multipli- 
cation operators and fractional integral operators. See Lacey [40] . This provides an alternate 
proof of a result of Chanillo. This result proves another characterization of BMO in terms of 
a commutator. 



4. Aspects of Product Hardy Theory 



We describe the elements of product Hardy space theory, as developed by S.-Y. Chang and R. Fefferman [6,7,23-25] as well as Journé [35,36]. By this, we mean the Hardy spaces associated with domains like $\mathbb{D} \otimes \mathbb{D}$, with boundary $\mathbb{T} \otimes \mathbb{T}$. In particular, the boundary is flat, and while we work with several variables, we are very far from the pseudoconvex case.
We view $\mathbb{R}^d$ as a tensor product of one dimensional spaces. In particular, previously, we used the splitting of $L^2(\mathbb{R}) = H^2_+(\mathbb{C}_+) \oplus H^2_-(\mathbb{C}_+)$. This leads to a decomposition of

$$L^2(\mathbb{R}^d) = \bigotimes_{j=1}^{d} L^2(\mathbb{R}) = \bigotimes_{j=1}^{d} \bigl[ H^2_+(\mathbb{C}_+) \oplus H^2_-(\mathbb{C}_+) \bigr]$$

into $2^d$ components.

To describe them, let us set $P_{\pm, j}$ to be the one dimensional Fourier projection operator $P_\pm$ acting on the $j$th coordinate. For $\sigma \in \{-, +\}^d$, set

$$P_\sigma \stackrel{\mathrm{def}}{=} \prod_{j=1}^{d} P_{\sigma_j, j} .$$

Likewise, we set $H^2_\sigma(\mathbb{C}_+^d)$ to be the range of the orthogonal projection $P_\sigma$. We then have

$$L^2(\mathbb{R}^d) = \bigoplus_{\sigma \in \{+, -\}^d} H^2_\sigma(\mathbb{C}_+^d) .$$

Among these $2^d$ Hardy spaces, we distinguish $H^2_\oplus(\mathbb{C}_+^d)$ in which $\sigma \equiv +$, and likewise for $H^2_\ominus(\mathbb{C}_+^d)$. The corresponding orthogonal projections are $P^\oplus$ and $P^\ominus$.

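Functions $f$ in this space are defined on $\mathbb{R}^d$. $\mathbb{R}^d$ is viewed as the boundary of the 'upper half space'

$$\mathbb{C}_+^d = \bigotimes_{j=1}^{d} \{ z \in \mathbb{C} : \mathrm{Im}(z) > 0 \} .$$

And we require that there is a function $F : \mathbb{C}_+^d \to \mathbb{C}$ that is holomorphic in each variable separately, and

$$f(x) = \lim_{\|y\| \to 0} F(x_1 + i y_1, \dots, x_d + i y_d) .$$

The norm of $f$ is taken to be

$$\|f\|_{H^1(\mathbb{C}_+^d)} = \lim_{y_1 \downarrow 0} \cdots \lim_{y_d \downarrow 0} \|F(x_1 + i y_1, \dots, x_d + i y_d)\|_{L^1(\mathbb{R}^d)} .$$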

4.1. Remark. The (real) Hardy space $H^1(\mathbb{R}^d)$ typically denotes the class of functions with
the norm

$$\|f\|_1 + \sum_{j=1}^d \|R_j f\|_1,$$

where $R_j$ denote the Riesz transforms. This space is invariant under the one parameter
family of isotropic dilations, while $H^1(\mathbb{C}^d_+)$ is invariant under dilations of each coordinate
separately. That is, it is invariant under a $d$ parameter family of dilations. That is why we
refer to 'multiparameter' theory, or '$d$ parameters.'
As before, the real $H^1$, $\operatorname{Re} H^1(\mathbb{C}^d_+)$, has a variety of equivalent norms, in terms of square
functions, maximal functions and Hilbert transforms. For our discussion of paraproducts, it
is appropriate to make some definitions of translation and dilation operators which extend
the definitions in (2.2)-(2.4). (Indeed, here we are adopting broader notation than we really
need, in anticipation of a discussion of multiparameter paraproducts.) Define

(4.2)  $\operatorname{Tr}_y f(x) = f(x - y), \qquad y \in \mathbb{R}^d,$

(4.3)  $\operatorname{Dil}^{(p)}_{(t_1,\dots,t_d)} f(x_1,\dots,x_d) \overset{\mathrm{def}}{=} (t_1 \cdots t_d)^{-1/p} f(x_1/t_1, \dots, x_d/t_d), \qquad t_1, \dots, t_d > 0,$

(4.4)  $\operatorname{Dil}^{(p)}_R \overset{\mathrm{def}}{=} \operatorname{Tr}_{c(R)} \operatorname{Dil}^{(p)}_{(|R_1|,\dots,|R_d|)}.$

In the last definition $R = R_1 \times \cdots \times R_d$ is a rectangle, and the dilation incorporates the
locations and scales associated with $R$. $c(R)$ is the center of $R$.
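
A quick check of the normalization in (4.3) may be useful. The computation below is a sketch assuming the prefactor in (4.3) is $(t_1 \cdots t_d)^{-1/p}$ with $0 < p < \infty$, as in the one parameter definitions; the substitution variable $u$ is introduced here. It shows that these dilations, and hence the operators in (4.4), preserve the $L^p$ norm.

```latex
% Change of variables x_j = t_j u_j, so dx = (t_1 ... t_d) du
\begin{align*}
\int_{\mathbb{R}^d} \bigl| (t_1 \cdots t_d)^{-1/p} f(x_1/t_1, \dots, x_d/t_d) \bigr|^p \, dx
  &= (t_1 \cdots t_d)^{-1} \int_{\mathbb{R}^d} |f(u)|^p \, (t_1 \cdots t_d) \, du \\
  &= \|f\|_p^p .
\end{align*}
% Translation is an isometry on L^p, so composing with Tr_{c(R)} does not change the norm.
```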

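Let $\mathcal{D}^d = \mathcal{D} \times \cdots \times \mathcal{D}$ denote the $d$ fold product of the dyadic intervals. These are the
dyadic rectangles in $\mathbb{R}^d$. For a non negative bump function $\varphi^1$ with $\int \varphi^1 \, dx = 1$, define the
(strong) maximal function by

$$M \cdots M f(x) = \sup_{R \in \mathcal{D}^d} \operatorname{Dil}^{(\infty)}_R \varphi^1(x) \, |\langle f, \operatorname{Dil}^{(1)}_R \varphi^1 \rangle| .$$

We use the superscript on $\varphi^1$ to indicate that it has a non zero integral.
Fix a bump function $\varphi^0$ so that

$$\varphi^0(x_1, \dots, x_d) = \prod_{j=1}^d \varphi(x_j)$$

where $\int_{\mathbb{R}} \varphi \, dx = 0$. Then set an analog of the Littlewood Paley square function to be

$$S \cdots S f(x) = \Bigl[ \sum_{R \in \mathcal{D}^d} [\operatorname{Dil}^{(2)}_R \varphi^0(x)]^2 \, |\langle f, \operatorname{Dil}^{(2)}_R \varphi^0 \rangle|^2 \Bigr]^{1/2} .$$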

4.5. Theorem. All of the norms below are equivalent, and can be used as a definition of real
$\operatorname{Re} H^1(\mathbb{C}^d_+)$.

$$\|M \cdots M f\|_1, \qquad \|S \cdots S f\|_1, \qquad \sum_{\sigma \in \{\pm\}^d} \|P_\sigma f\|_1,$$

$$\sum_{A_1, \dots, A_d \,:\, A_j \in \{I, H_j\}} \Bigl\| \prod_{j=1}^d A_j f \Bigr\|_1 .$$

In the last line, we are summing over all choices of operators $A_j$ being either the identity
operator, or $H_j$, the Hilbert transform computed in the $j$th direction.
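
As an illustration of the last norm, here is the case $d = 2$ written out, together with a check of the two parameter dilation invariance mentioned in Remark 4.1. This is a sketch: the dilation operator $D_t$ is notation introduced here, and the only fact used is that each Hilbert transform $H_j$ commutes with positive dilations in every coordinate.

```latex
% Last norm of Theorem 4.5 for d = 2: each A_j is either I or H_j
\[
\|f\|_1 + \|H_1 f\|_1 + \|H_2 f\|_1 + \|H_1 H_2 f\|_1 .
\]
% Two parameter dilations, normalized in L^1 (hypothetical notation D_t):
%   D_t f(x_1, x_2) = (t_1 t_2)^{-1} f(x_1/t_1, x_2/t_2),  t_1, t_2 > 0.
% Since H_1 and H_2 commute with D_t, and D_t preserves the L^1 norm,
\[
\|A_1 A_2 D_t f\|_1 = \|D_t A_1 A_2 f\|_1 = \|A_1 A_2 f\|_1 ,
\qquad A_j \in \{I, H_j\},
\]
% so each of the four terms, and hence the norm, is unchanged for every choice of (t_1, t_2).
```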
4.1. $\mathrm{BMO}(\mathbb{C}^d_+)$. The dual of the real Hardy space is $(\operatorname{Re} H^1(\mathbb{C}^d_+))^* = \mathrm{BMO}(\mathbb{C}^d_+)$, the $d$-fold
product BMO space. It is a Theorem of S.-Y. Chang and R. Fefferman [7] that this space
has a characterization in terms of the product Carleson measure introduced above. We need
the product wavelet basis. For a rectangle $R = \prod_{j=1}^d R_j \in \mathcal{D}^d$, set

$$w_R(x_1, \dots, x_d) = \prod_{j=1}^d w_{R_j}(x_j) = \operatorname{Dil}^{(2)}_R w_{[0,1]^d}(x) .$$

The basis $\{w_R : R \in \mathcal{D}^d\}$ is the $d$-fold tensor product of the wavelet basis. We use the
same notation $w_R$ and $v_R$ for the Meyer wavelet basis, and the analytic Meyer wavelet basis.
Define

(4.6)  $\|b\|_{\mathrm{BMO}(\mathbb{R}^d)} = \sup_{U \subset \mathbb{R}^d} \Bigl[ |U|^{-1} \sum_{R \subset U} |\langle b, w_R \rangle|^2 \Bigr]^{1/2}$

where we have replaced the Haar wavelets by the Meyer wavelets on the right.

It is the Theorem of Chang and Fefferman that
4.7. Theorem. We have the equivalence of norms

$$\|f\|_{(\operatorname{Re} H^1(\mathbb{C}^d_+))^*} \simeq \|f\|_{\mathrm{BMO}(\mathbb{R}^d)} .$$

That is, $\mathrm{BMO}(\mathbb{R}^d)$ is the dual to $\operatorname{Re} H^1(\mathbb{C}^d_+)$.

To define analytic $\mathrm{BMO}(\mathbb{C}^d_+)$, it suffices to replace the Meyer wavelets above by the analytic
Meyer wavelets.

4.2. Journe's Lemma. The explicit definition of BMO in (4.6) is quite difficult to work
with. In the first place, it is not an intrinsic definition, in that one needs some notion of
wavelet to define it. Secondly, the supremum is over a very broad class of objects: all subsets
of $\mathbb{R}^d$ of finite measure. There are simpler definitions (that unfortunately are not intrinsic)
that in particular circumstances are sufficient.

For our purposes, there are two appropriate definitions. Set $\|f\|_{\mathrm{BMO(rec)}}$ to be the supremum
in (4.6), but with the important restriction that the sets $U$ are taken to be rectangles.
Historically, this was the first natural guess for the correct definition of $\mathrm{BMO}(\mathbb{C}^d_+)$. But,
in a key moment, L. Carleson [5] produced examples of functions which acted as linear
functionals on $H^1(\mathbb{C}^2_+)$ with norm one, yet had arbitrarily small BMO(rec) norm. This
example is recounted at the beginning of R. Fefferman's article [25].

Despite this fact, Journe's Lemma shows that in certain circumstances, the rectangular
BMO(rec) norm can dominate the $\mathrm{BMO}(\mathbb{C}^d_+)$ norm. Let us state this Lemma in the case of
two parameters before moving to the more sophisticated variants that we will need in three
and higher parameters.
Given a set $U \subset \mathbb{R}^2$ of finite measure, let

$$\operatorname{Emb}(R; U) = \sup\{ \mu \ge 1 : \mu R \subset V \}, \qquad V = \{ M M \mathbf{1}_U > \tfrac12 \} .$$

This is defined for rectangles $R \subset U$, where $\mu R$ denotes the rectangle with the same center
as $R$, which is dilated by a factor of $\mu$ in all directions. Notice that we have $|V| \lesssim |U|$,
and that $V$ is a natural 'dilate of $U$.' The function $\operatorname{Emb}(R; U)$ is a measure of how deeply
embedded $R$ is inside of $U$.

A key distinction in two and higher parameters concerns the collection of rectangles
$\{R : R \subset U, \ \operatorname{Emb}(R; U) \simeq \mu\}$. This collection of rectangles is not pairwise disjoint, but their overlap
is, in an appropriate sense, at worst logarithmic in $\mu$. A formulation of this principle is easiest
in two parameters.

4.8. Lemma (Journe's Lemma [35] in $\mathbb{R}^2$). For any function $f$, and $\epsilon > 0$, we have the
inequality below valid for all sets $U \subset \mathbb{R}^2$ of finite measure.



The implied constant depends only on $\epsilon > 0$.

Notice that the last inequality is that the BMO norm is dominated by the (generally 
smaller) BMO(rec) norm. Carleson's examples show that this inequality is false if we do not 
'dampen' the wavelet coefficients in some way. Journe's insight is that this can be done with 
the geometric notion of the enlargement term. 

We will need this observation in the case of $\mathbb{C}^d_+$. But, the rectangular norm is ill suited
to our needs in three and higher dimensions. We make this definition, which reduces to
rectangular BMO in dimension 2.

Say that a collection of rectangles $\mathcal{U} \subset \mathcal{D}^d$ has $d - 1$ parameters iff there is a choice of
coordinate $j$ so that for all $R, R' \in \mathcal{U}$ we have $R_j = R'_j$; that is, the $j$th coordinates of the
rectangles agree.

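We then define

(4.9)  $\|f\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} = \sup_{\mathcal{U} \text{ has } d-1 \text{ parameters}} \Bigl[ |\operatorname{sh}(\mathcal{U})|^{-1} \sum_{R \in \mathcal{U}} |\langle f, w_R \rangle|^2 \Bigr]^{1/2} .$

A collection of rectangles has a shadow given by $\operatorname{sh}(\mathcal{U}) = \bigcup \{R : R \in \mathcal{U}\}$. Observe that in
$d = 2$ this reduces to the rectangular BMO definition. We use the $-1$ subscript to indicate
that we have 'lost one parameter' in the definition.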
The extension of Journe's Lemma that we need replaces the BMO(rec) norm by this
$\mathrm{BMO}_{-1}(\mathbb{C}^d_+)$ norm. Yet one more refinement is essential for our needs, that the 'dilate' of
the set $U$ be taken with considerably more care, and in particular should be just a little bit
bigger than $U$ in measure, see (4.12).

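4.10. Lemma (Journe's Lemma in $d - 1$ parameters). For all $\eta > 0$, and collections of
rectangles $\mathcal{U}$ whose shadow has finite measure, we can construct $V \supset \operatorname{sh}(\mathcal{U})$ and a function
$\operatorname{Emb} : \mathcal{U} \to [1, \infty)$ so that

(4.11)  $\operatorname{Emb}(R) \cdot R \subset V, \qquad R \in \mathcal{U},$

(4.12)  $|V| < (1 + \eta) |\operatorname{sh}(\mathcal{U})|,$

(4.13)  $\Bigl\| \sum_{R \in \mathcal{U}} \operatorname{Emb}(R)^{-2d} \langle f, w_R \rangle w_R \Bigr\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \le K_\eta \|f\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} .$

The last inequality holds for all functions $f$, with the constant $K_\eta$ depending only on $\eta$.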

Notice that the power on the embeddedness term in (4.13) is quite large, twice the number
of parameters. Also, concerning the conclusions, if we were to take $\operatorname{Emb}(R) \equiv 1$, then
certainly the first conclusion (4.11) would be true. But the last conclusion would be false
for the Carleson examples in particular, so this choice is not permitted in general.

The formulations of Journe's Lemma given here are not the typical ones found in Journe's
original Lemma, or J. Pipher's extension to the three dimensional case. These papers give the
more geometric formulation of these Lemmas, and J. Pipher's article implicitly contains the
geometric formulation needed to prove the Lemma above (provided one is satisfied with the
estimate $|V| \lesssim |\operatorname{sh}(\mathcal{U})|$). See Pipher [60]. Lemma 4.10, as formulated above, was found in
Lacey and Terwelleger [38]; the two dimensional variant (which is much easier) appeared in
Lacey and Ferguson [27]. The paper of Cabrelli, Lacey, Molter and Pipher [4] is a comprehensive
survey of issues related to Journe's Lemma. See in particular Sections 2 and 4. We
refer the reader to it for more information on this subject.



5. Multiparameter Paraproducts 



We now consider paraproducts formed over sums of dyadic rectangles in $\mathbb{R}^d$. Let us say
that a function $\varphi$ is adapted to a rectangle $R = \prod_{j=1}^d R_j$ iff $\varphi(x_1, \dots, x_d) = \prod_{j=1}^d \varphi_j(x_j)$,
with each $\varphi_j$ adapted to the interval $R_j$ in the sense of (2.17).

Our paraproducts are of the same general form

$$B(f_1, f_2) = \sum_{R \in \mathcal{R}} \frac{\varphi_{3,R}}{|R|^{1/2}} \prod_{v=1}^{2} \langle f_v, \varphi_{v,R} \rangle .$$

Here, we let $\mathcal{R} = \mathcal{D}^d$ be the class of dyadic rectangles.

The Theorem in this setting is 

5.1. Theorem. Let $1 < p < \infty$, and $J \subset \{1, \dots, d\}$. Assume that for each choice of
coordinate $1 \le j \le d$, and $v = 1$,

(5.2)  $\int_{\mathbb{R}} \varphi_{v,R}(x_1, x_2, \dots, x_d) \, dx_j = 0$, for all $x_k$ with $k \ne j$ and all $R$.

In addition, for each $1 \le j \le d$ and all $R \in \mathcal{D}^d$, assume that the condition above holds for
$\varphi_{v,R}$, where $v = 2$ if $j \in J$ and $v = 3$ if $j \notin J$. Then, we have the inequality

(5.3)  $B : \mathrm{BMO}(\mathbb{C}^d_+) \times L^p \to L^p .$

We are not stating this result in greatest generality. It was first discussed in the paper
of Journe [36]. Recently, the result has received new attention, and extension. See Muscalu,
Pipher, Tao and Thiele [44,45]. Our discussion is drawn from Lacey and Metcalfe [37], and
in particular, this last paper proves this Theorem.

The critical distinction comes from the assumption about the zeros, (5.2). We need zeros
in every coordinate on the functions that land on the $\mathrm{BMO}(\mathbb{C}^d_+)$ function. There is one more
zero in each coordinate, and they can be split up between the second and third functions.

Notice that there are many different types of paraproducts. The first case, with the greatest
similarity to the one parameter case, is where we have, for example, $x_j$ zeros in the first and second
positions for all $1 \le j \le d$. The other cases do not have a proper analog in the one parameter
case.

We will have need of paraproducts which are presented in a somewhat different way, in
analogy to Theorem 2.23. We make some definitions. For $\vec\jmath \in \mathbb{Z}^d$, let us set

$$\Delta U_{\vec\jmath} = \sum_{\substack{R \in \mathcal{D}^d \\ |R_s| = 2^{j_s}, \ 1 \le s \le d}} u_R \otimes u_R .$$

For a subset of coordinates $J \subset \{1, \dots, d\}$ set

$$U_{\vec\jmath, J} = \sum_{\substack{k_s = j_s, \ s \in J \\ k_s > j_s, \ s \notin J}} \Delta U_{\vec k} .$$

For those coordinates $s \in J$, we take the wavelet projection onto that scale, while for those
coordinates $s \notin J$, we sum over all larger scales.

Write $R' <_J R$ iff $|R'_s| < |R_s|$ for $s \notin J$ and $|R'_s| = |R_s|$ for $s \in J$.
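5.4. Theorem. For all $J \subset \{1, \dots, d\}$, and $\vec k \in \mathbb{Z}^d$ with $\|\vec k\|_\infty \le 8$, we have

(5.5)  $\Bigl\| \sum_{\vec\jmath \in \mathbb{Z}^d} (\Delta U_{\vec\jmath} \, b) \cdot U_{\vec\jmath + \vec k, J} \, \varphi \Bigr\|_2 \lesssim \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \|\varphi\|_2 .$

Moreover, suppose we have the following separation condition: Fix an integer $A > 0$. Suppose
that

(5.6)  if $\langle b, u_{R'} \rangle \ne 0$, $\langle \varphi, u_R \rangle \ne 0$ with $R' <_J R$, then $A R \cap R' = \emptyset$.

We then have the estimate

(5.7)  $\Bigl\| \sum_{\vec\jmath \in \mathbb{Z}^d} (\Delta U_{\vec\jmath} \, b) \cdot U_{\vec\jmath + \vec k, J} \, \varphi \Bigr\|_2 \lesssim A^{-100d} \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \|\varphi\|_2 .$

Implied constants are independent of the choice of $\vec k$.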

Proof. The method of proof is quite similar to that of Theorem 2.23.

We treat a special case with a brute force approach. Consider

$$\sum_{\vec\jmath \in \mathbb{Z}^d} (\Delta U_{\vec\jmath} \, b) \cdot (\Delta U_{\vec\jmath + \vec k} \, \varphi), \qquad \vec k \in \mathbb{Z}^d, \quad \|\vec k\|_\infty \le 8 .$$

We claim that the two estimates of the Theorem hold for these operators.

Fix an integer $B \ge 2$. Let $\pi : \mathcal{D}^d \to \mathcal{D}^d$ be a map so that for all $R \in \mathcal{D}^d$ we have
$2^{k_s} |R_s| = |\pi(R)_s|$ for $1 \le s \le d$. In addition, the distance between $R$ and $\pi(R)$ is essentially
constant. Namely,

$$B^{-1} \le M \cdots M \mathbf{1}_R (c(\pi(R))) \le (B - 1)^{-1} .$$

(In the current setting, it is most natural to use the maximal function to measure distances.)
Then, the sum

$$\sum_{R \in \mathcal{D}^d} \frac{\langle b, u_R \rangle}{\sqrt{|R|}} \, \langle \varphi, u_{\pi(R)} \rangle \cdot \sqrt{|R|} \, u_R \, u_{\pi(R)}$$

is a paraproduct, with zeros in all coordinates for both $b$ and $\varphi$. This is not necessarily the
case for the third place, but $B^2 \sqrt{|R|} \, u_R \, u_{\pi(R)}$ is adapted to $R$ with constant independent
of $B$. To prove (5.5), we then sum over $B \ge 2$; to prove (5.7), sum over $B \gtrsim A$.

We recall the 'father wavelet' $W$ from the proof of Theorem 2.23. For a subset of coordinates
$J \subset \{1, \dots, d\}$ we set

$$W_{R,J}(x_1, \dots, x_d) = \prod_{s \in J} u_{R_s}(x_s) \prod_{s \notin J} W_{R_s}(x_s) ;$$

thus, in the coordinates in $J$ we take an analytic Meyer wavelet, and for those coordinates
not in $J$ we take a father wavelet.
Observe that

$$U_{\vec\jmath, J} = \sum_{\substack{R \in \mathcal{D}^d \\ |R_s| = 2^{j_s}}} W_{R,J} \otimes W_{R,J} .$$

For 9 = (9, . . . , 9), we need to provide the two bounds for the Theorem for 



For an integer $B$, and map $\pi$ as above, it suffices to consider the sum

$$\sum_{R \in \mathcal{D}^d} \frac{\langle b, u_R \rangle}{\sqrt{|R|}} \, \langle \varphi, W_{\pi(R),J} \rangle \, \psi_{R,J}, \qquad \psi_{R,J} \overset{\mathrm{def}}{=} \sqrt{|R|} \, u_R \cdot W_{\pi(R),J} .$$

This is a paraproduct. Note that for the function $b$, we have zeros in all coordinates; for
the function $\varphi$, we have zeros in coordinates $s \in J$, where $W_{\pi(R),J}$ carries a wavelet; the
function $\psi_{R,J}$ has zeros in those coordinates $s \notin J$. Finally, $B^2 \psi_{R,J}$ is adapted to $R$ with
constants that are independent of $B$ or the choice of $\pi$. Thus, the two claimed inequalities
of the Theorem hold for these sums, and this completes the proof of the Theorem.

□ 

6. Nehari Theorem in Several Variables 
The Hankel operators we are concerned with are maps from $H^2_\oplus(\mathbb{C}^d_+)$ to $H^2_\oplus(\mathbb{C}^d_+)$ given
by

(6.1)  $H_b \varphi \overset{\mathrm{def}}{=} P_\oplus M_b \overline{\varphi} .$

This definition only depends upon $P_\oplus b$.

6.2. Remark. These are the 'little' Hankel operators, in that we are taking the 'smallest'
reasonable projection above. To define the 'big' Hankel operators, one would replace $P_\oplus$
above by $I - P_\ominus$. We refer the reader to Cotlar and Sadosky [17] for the theory of these 'big'
Hankel operators.

The Nehari Theorem in this context is:
Multiparameter Nehari Theorem ([27,38]). We have the equivalence

(6.3)  $\|H_b\| \simeq \|P_\oplus b\|_{\mathrm{BMO}(\mathbb{C}^d_+)}$

where the latter space is S.-Y. Chang and R. Fefferman BMO, the dual to the Hardy space
$H^1(\mathbb{C}^d_+)$.
This theorem has equivalent statements; the most obvious of these concerns the multiparameter
commutator

$$C_b \overset{\mathrm{def}}{=} [ \cdots [ M_b, H_1 ], \dots, H_d ]$$

where $H_j$ denotes the Hilbert transform computed in the $j$th coordinate.

Less obviously, there is an equivalent formulation in terms of factorization. As we have
commented, the classical factorization of $H^1$ functions given in Proposition 2.11 does not
extend to $H^1(\mathbb{C}^d_+)$. The Nehari theorem is equivalent to weak factorization. The formalization
of this is done in terms of tensor products of $H^2(\mathbb{C}^d_+)$.

We define a projective tensor product norm by

$$\|f\|_{H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+)} = \inf \Bigl\{ \sum_j \|\varphi_j\|_{H^2(\mathbb{C}^d_+)} \|\psi_j\|_{H^2(\mathbb{C}^d_+)} : f = \sum_j \varphi_j \psi_j, \ \varphi_j, \psi_j \in H^2(\mathbb{C}^d_+) \Bigr\} .$$

6.4. Theorem. Any one of the equivalences of norms below is a consequence of the other
equivalences.

(6.5)  $\|H_b\| \simeq \|P_\oplus b\|_{\mathrm{BMO}(\mathbb{R}^d)}$

(6.6)  $\|C_b\|_{2 \to 2} \simeq \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)}$

(6.7)  $\|f\|_{H^1(\mathbb{C}^d_+)} \simeq \|f\|_{H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+)}$

In the first two, we take the $\mathrm{BMO}(\mathbb{R}^d)$ norm to be real valued BMO. In the second two,
$\mathrm{BMO}(\mathbb{C}^d_+)$ is analytic BMO.

The last equivalence of norms is the weak factorization statement in $H^1(\mathbb{C}^d_+)$. It explains
in part why the factorization proof of the one parameter Nehari theorem is so easy: The
factorization property is stronger than Nehari's Theorem.

6.8. Corollary. We have

$$\|H_b\| = \inf \{ \|\beta\|_\infty : P_\oplus b = P_\oplus \beta \} .$$

Theorem 6.4 was known and elementary; once weak factorization (6.7) is known, Corollary 6.8
is easy. Thus, the Multiparameter Nehari Theorem is the main point. The
inequality $\|H_b\| \lesssim \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)}$ turns out to be quite easy: it is a consequence of the trivial
inclusion in the weak factorization statement. The issue is to establish the lower bound on
the norm of the Hankel operator.

The central difficulty here lies in the subtle nature of BMO in the higher parameter case.
The proof we give is an induction on $d$, using weak factorization in $H^1(\mathbb{C}^{d-1}_+)$ in a critical
moment. Appealing to weak factorization will give us a lower bound in terms of $\mathrm{BMO}_{-1}$.
And so we need to 'bootstrap' from this weaker inequality to the stronger inequality. The
bootstrapping argument appeals to the Journe Lemma.
It suffices to assume that $b = P_\oplus b \in \mathrm{BMO}(\mathbb{C}^d_+)$ is of norm one, and find an absolute lower
bound on $\|H_b\|$. We begin by using the induction hypothesis to establish

$$\|H_b\| \gtrsim \|b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)},$$

where the latter norm is the BMO norm 'with one less parameter' defined in (4.9). Thus, we are
free to impose the additional hypothesis that $\|b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)}$ is less than some fixed, absolute
constant. Observe that implicitly, this forces $b$ to be the type of function which Carleson
discovered.

Yet, Journe's Lemma gives modest sufficient conditions for this impoverished norm to
dominate the true BMO norm. The lower bound for the norm of $H_b$ can then be explicitly
estimated as a main term, plus several error terms. Each of the error terms is a paraproduct,
which can be controlled with Journe's Lemma and the fact that the impoverished norm is small.

Proof of Theorem 6.4. We discuss the proof of Theorem 6.4 and Corollary 6.8. Observe that
the computation (3.1) is quite general. In the language we have introduced above, it shows
immediately that

(6.9)  $\|H_b\| \simeq \|P_\oplus b\|_{(H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+))^*} .$

That is, the Hankel norms are equivalent to the dual norm of the tensor product norm.
The equivalence of (6.5) and (6.7) is then immediate.

Concerning the commutator, and (6.6), as in the one parameter case, the commutator is
seen to be a sum of $2^d$ Hankel operators. Indeed, for $\sigma \in \{-,+\}^d$, consider the composition
$C_b P_\sigma$. In the definition of the commutator, we are free to replace the $j$th Hilbert transform $H_j$
by $P_{-\sigma(j),j}$, since $H_j = \pm (I - 2 P_{-\sigma(j),j})$, and the identity commutes with everything. Thus,

$$C_b P_\sigma = \pm 2^d P_{-\sigma} M_b P_\sigma .$$

From this, the equivalence of (6.5) and (6.6) is immediate.

□ 
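
The reduction of the commutator to Hankel-type operators can be checked by hand in one parameter. The computation below is a sketch under the assumption that the Hilbert transform is normalized as $H = c\,(P_+ - P_-)$ for a unimodular constant $c$ (the value of $c$ depends on the convention and plays no role in the norms).

```latex
% One parameter: H = c (P_+ - P_-) = c (I - 2 P_-) = -c (I - 2 P_+),  |c| = 1,  P_+ P_- = P_- P_+ = 0.
\begin{align*}
[M_b, H] P_+ &= c \, [M_b, I - 2 P_-] P_+ = -2c \, (M_b P_- - P_- M_b) P_+ = 2c \, P_- M_b P_+ , \\
[M_b, H] P_- &= -c \, [M_b, I - 2 P_+] P_- = 2c \, (M_b P_+ - P_+ M_b) P_- = -2c \, P_+ M_b P_- .
\end{align*}
% Adding the two lines (P_+ + P_- = I):
\[
[M_b, H] = 2c \, \bigl( P_- M_b P_+ - P_+ M_b P_- \bigr) .
\]
```

Iterating the same step in each coordinate gives the factor $2^d$ and the pair $P_{-\sigma}$, $P_\sigma$ in the display for $C_b P_\sigma$.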



Proof of Corollary 6.8. We can assume that the symbol of the Hankel operator $H_b$ is in
analytic BMO. Then, (6.9) and (6.7) show that $b$ defines a bounded linear functional on
$H^1(\mathbb{C}^d_+) \subset L^1(\mathbb{R}^d)$. Appeal to the Hahn Banach Theorem to extend this linear functional to
all of $L^1(\mathbb{R}^d)$, with the same norm. The Corollary follows. □
7. Proof of Multiparameter Nehari Theorem



The upper bound on the norm of a Hankel operator is easy. Observe that, trivially,

$$H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+) \subset H^1(\mathbb{C}^d_+) .$$

For the dual spaces, we have the reverse inclusion. In particular, the $\mathrm{BMO}(\mathbb{C}^d_+)$ norm is
larger than the dual tensor product norm. Thus, by (6.9),

$$\|H_b\| \simeq \|P_\oplus b\|_{(H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+))^*} \lesssim \|P_\oplus b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} .$$

Thus, the primary difficulty is in establishing the lower bound on the norm of the Hankel
operator.
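
The upper bound can also be seen directly from the pairing. The lines below are a sketch, assuming the Hankel operator is given by $H_b \varphi = P_\oplus M_b \overline{\varphi}$ and that $\langle f, g \rangle = \int f \overline{g}$; the test functions $\varphi, \psi \in H^2_\oplus(\mathbb{C}^d_+)$ are introduced here for illustration.

```latex
% P_\oplus is self-adjoint and \psi = P_\oplus \psi, so the projection can be dropped:
\[
\langle H_b \varphi, \psi \rangle
  = \langle P_\oplus ( b \overline{\varphi} ), \psi \rangle
  = \int_{\mathbb{R}^d} b \, \overline{\varphi} \, \overline{\psi} \, dx
  = \langle b, \varphi \psi \rangle .
\]
% \varphi\psi is analytic, and by Cauchy-Schwarz its H^1 norm is at most the product of the H^2 norms:
\[
| \langle b, \varphi \psi \rangle |
  \lesssim \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \, \|\varphi \psi\|_{H^1(\mathbb{C}^d_+)}
  \le \|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \, \|\varphi\|_2 \, \|\psi\|_2 .
\]
% Taking the supremum over \varphi, \psi of norm one gives \|H_b\| \lesssim \|b\|_{BMO}.
```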



7.1. The Initial Lower Bound. The proof is by induction on dimension $d$, and we take
the classical Nehari Theorem as the base case in the induction. Thus, we assume that (6.3)
holds in dimension $d - 1 \ge 1$, and prove it in dimension $d$.

Take $b$ to be in analytic $\mathrm{BMO}(\mathbb{C}^d_+)$, and of norm one. We recall that this means in
particular, that we have

(7.1)  $\sup_U |U|^{-1} \sum_{R \subset U} |\langle b, v_R \rangle|^2 = 1,$

where we recall that the supremum is over all subsets $U$ of finite measure, and that the
functions $v_R$ are the analytic Meyer wavelets associated to dyadic rectangles in $\mathbb{R}^d$.

Let us argue that

(7.2)  $\|H_b\| \gtrsim \|b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} .$

This last norm is given in (4.9), and in particular, it is a supremum as in (7.1), with an
additional restriction on rectangles that contribute to that sum.

Now, this inequality we are to prove, by (6.9), reduces to showing

(7.3)  $\|P_\oplus b\|_{(H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+))^*} \gtrsim \|P_\oplus b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} .$

We can assume that $b = P_\oplus b$ is a Schwartz function, and that $\|b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} = 1$. Thus, after
a permutation of coordinates and a possible dilation, we can take a collection of rectangles $\mathcal{U}$
which achieves the supremum in the $\mathrm{BMO}_{-1}(\mathbb{C}^d_+)$ norm.

In particular, we can assume that

• $|\operatorname{sh}(\mathcal{U})| = 1$;

• there is an interval $I$ of length one so that for all $R \in \mathcal{U}$ we have $R_1 = I$;

• for $\psi = \sum_{R \in \mathcal{U}} \langle b, v_R \rangle v_R$ we have $\langle b, \psi \rangle = 1$.

Then, it suffices to see that $\|\psi\|_{H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+)} \lesssim 1$.

Write $x = (x_1, x_2, \dots, x_d) \in \mathbb{R}^d$ as $(x_1, x')$ with $x' = (x_2, \dots, x_d) \in \mathbb{R}^{d-1}$. Each rectangle
$R \in \mathcal{U}$ has the same first coordinate. So the first coordinate in the product that defines the
Meyer analytic wavelet $v_R$ is independent of $R$. Therefore, we can write $\psi(x) = \psi_1(x_1) \psi'(x')$
where $\psi_1(x_1) \in H^1(\mathbb{R})$ is of norm one. It can be written as $\psi_1 = \alpha \cdot \beta$ with $\alpha$ and $\beta$ of $H^2(\mathbb{R})$
norm one.

$\psi'$ satisfies something similar. Observe that

$$\|\psi'\|_{H^1(\mathbb{C}^{d-1}_+)} \lesssim |\operatorname{sh}(\mathcal{U})|^{1/2} \|\psi'\|_2 \lesssim 1 .$$

Hence, $\psi'$ is in $H^1(\mathbb{C}^{d-1}_+)$, and is of norm at most one. In fact, it has norm comparable to one,
since by construction $\|\psi'\|_{\mathrm{BMO}(\mathbb{C}^{d-1}_+)} \simeq 1$ and $\langle \psi', \psi' \rangle = 1$. Thus, by the induction hypothesis,
we have

$$\|\psi'\|_{H^1(\mathbb{C}^{d-1}_+)} \simeq \|\psi'\|_{H^2(\mathbb{C}^{d-1}_+) \hat\otimes H^2(\mathbb{C}^{d-1}_+)} \simeq 1 .$$

Thus, $\psi'$ can be written as a sum of products $\alpha'_j \cdot \beta'_j$ with

$$\sum_j \|\alpha'_j\|_{H^2(\mathbb{C}^{d-1}_+)} \|\beta'_j\|_{H^2(\mathbb{C}^{d-1}_+)} \lesssim 1 .$$

But then, it is clear that we can write

$$\psi(x_1, x') = \sum_j \alpha(x_1) \alpha'_j(x') \cdot \beta(x_1) \beta'_j(x')$$

and so $\|\psi\|_{H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+)} \lesssim 1$.
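
The final step can be spelled out as follows. This is a sketch assuming the factorizations $\psi_1 = \alpha \cdot \beta$ in $H^2(\mathbb{R})$ and $\psi' = \sum_j \alpha'_j \cdot \beta'_j$ in $H^2(\mathbb{C}^{d-1}_+)$ described above; the tensor notation $\alpha \otimes \alpha'_j$ for the function $(x_1, x') \mapsto \alpha(x_1) \alpha'_j(x')$ is introduced here.

```latex
% Each tensor \alpha \otimes \alpha'_j is analytic in every variable, hence lies in H^2(C^d_+),
% and by Fubini its L^2(R^d) norm is the product of the norms of the factors.
\begin{align*}
\|\psi\|_{H^2(\mathbb{C}^d_+) \hat\otimes H^2(\mathbb{C}^d_+)}
  &\le \sum_j \|\alpha \otimes \alpha'_j\|_{2} \, \|\beta \otimes \beta'_j\|_{2} \\
  &= \|\alpha\|_{2} \, \|\beta\|_{2} \sum_j \|\alpha'_j\|_{2} \, \|\beta'_j\|_{2} \\
  &\lesssim 1 .
\end{align*}
```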

7.2. The $\mathrm{BMO}(\mathbb{C}^d_+)$ lower bound. Our task is to 'bootstrap' from the weaker inequality
(7.2). Namely, for an absolute constant $\eta_{-1}$ whose value is to be specified, it suffices to
consider Hankel symbols $b$ which satisfy $b = P_\oplus b$; $b$ is a Schwartz function; $\|b\|_{\mathrm{BMO}(\mathbb{C}^d_+)} = 1$;
and $\|b\|_{\mathrm{BMO}_{-1}(\mathbb{C}^d_+)} < \eta_{-1}$. (The subscript $-1$ mimics our notation for the reduced parameter
BMO space.)

We show by direct computation that $\|H_b\| \gtrsim 1$, namely we will apply the Hankel operator
to a particular $H^2(\mathbb{C}^d_+)$ function, and provide a lower bound on the norm of the image.

Here is how we select the test function to apply the Hankel to. Select a collection of
rectangles $\mathcal{U}$ which achieve the supremum in the definition of the $\mathrm{BMO}(\mathbb{C}^d_+)$ norm. Thus,

$$\sum_{R \in \mathcal{U}} |\langle b, v_R \rangle|^2 = |\operatorname{sh}(\mathcal{U})| .$$

Moreover, we can assume, after taking an appropriate dilation, that $|\operatorname{sh}(\mathcal{U})| = 1$, and that if
$R \subset \operatorname{sh}(\mathcal{U})$, then $R \in \mathcal{U}$.

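The function we apply the Hankel to is the wavelet projection of $b$ onto the wavelets associated
with $\mathcal{U}$, $\alpha = \sum_{R \in \mathcal{U}} \langle b, v_R \rangle v_R$. Observe that

$$\|H_\alpha \alpha\|_2 = \bigl\| P_\oplus |\alpha|^2 \bigr\|_2 \simeq \bigl\| |\alpha|^2 \bigr\|_2 = \|\alpha\|_4^2
  \simeq \Bigl\| \Bigl[ \sum_{R \in \mathcal{U}} \frac{|\langle b, v_R \rangle|^2}{|R|} \mathbf{1}_R \Bigr]^{1/2} \Bigr\|_4^2
  \ge \Bigl\| \Bigl[ \sum_{R \in \mathcal{U}} \frac{|\langle b, v_R \rangle|^2}{|R|} \mathbf{1}_R \Bigr]^{1/2} \Bigr\|_2^2
  = \sum_{R \in \mathcal{U}} |\langle b, v_R \rangle|^2 = 1 .$$

Here, we are relying on the symmetry of the Fourier transform of positive functions; Littlewood
Paley inequalities, to pass to the wavelet square function; that $\mathcal{U}$ has shadow equal to
one in measure, and that $L^4$ norms dominate $L^2$ norms on a probability space. Thus, we
have $\|H_\alpha \alpha\| \ge \eta_0 > 0$, for absolute $\eta_0$.

This is in fact our main estimate. Our task is to show that for $\eta_{-1}$ sufficiently small,
we have

(7.4)  $\|H_{b - \alpha} \alpha\| \le \tfrac12 \eta_0 .$

This can be done with the aid of Journe's Lemma.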

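Fix a second small parameter $\eta_J$ whose value will be specified below. (The subscript $J$ is for
'Journe.') Apply Lemma 4.10. There is a set $V \supset \operatorname{sh}(\mathcal{U})$ and a function $\operatorname{Emb} : \mathcal{U} \to [1, \infty)$
for which these conditions hold.

• $|V| < 1 + \eta_J$;

• $\operatorname{Emb}(R) R \subset V$ for all $R \in \mathcal{U}$;

• $\|\delta\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \le K_{\eta_J} \eta_{-1}$

where in the last line, we have

(7.5)  $\delta = \sum_{R \in \mathcal{U}} \operatorname{Emb}(R)^{-2d} \langle b, v_R \rangle v_R .$

We now decompose the symbol $b$. We have already defined $\alpha$. Set

(7.6)  $\beta \overset{\mathrm{def}}{=} \sum_{\substack{R \subset V \\ R \notin \mathcal{U}}} \langle b, v_R \rangle v_R .$

Thus, these are the rectangles which are 'close' to $\mathcal{U}$, but not in it, as defined by the set $V$.
Define $\gamma$ by $b = \alpha + \beta + \gamma$. To verify (7.4), it suffices to show that

(7.7)  $\|H_\beta \alpha\| \le K \eta_J^{1/4},$

(7.8)  $\|H_\gamma \alpha\| \le K_{\eta_J} \eta_{-1} .$

One then specifies $\eta_J$ so that the top line is no more than a fixed small multiple of $\eta_0$, say
$\tfrac14 \eta_0$. The constant $K_{\eta_J}$ that appears in the second line is absolute once $\eta_J$ is fixed, so we
can then fix $\eta_{-1}$ sufficiently small to prove (7.4).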
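
The order in which the parameters are chosen can be made explicit. The following is a sketch assuming the main estimate $\|H_\alpha \alpha\| \ge \eta_0$ together with (7.7) and (7.8); the specific fractions $\tfrac14$ are illustrative choices made here, not values taken from the text.

```latex
% Step 1: choose \eta_J (depends only on \eta_0 and the absolute constant K in (7.7)).
\[
\eta_J = \Bigl( \frac{\eta_0}{4K} \Bigr)^{4}
  \quad \Longrightarrow \quad \|H_\beta \alpha\| \le K \eta_J^{1/4} = \tfrac14 \eta_0 .
\]
% Step 2: with \eta_J fixed, K_{\eta_J} is a fixed number; now choose \eta_{-1}.
\[
\eta_{-1} = \frac{\eta_0}{4 K_{\eta_J}}
  \quad \Longrightarrow \quad \|H_\gamma \alpha\| \le K_{\eta_J} \eta_{-1} = \tfrac14 \eta_0 .
\]
% Step 3: b = \alpha + \beta + \gamma and b \mapsto H_b is linear, so
\[
\|H_b \alpha\| \ge \|H_\alpha \alpha\| - \|H_\beta \alpha\| - \|H_\gamma \alpha\|
  \ge \eta_0 - \tfrac14 \eta_0 - \tfrac14 \eta_0 = \tfrac12 \eta_0 ,
\]
% and since \|\alpha\|_2^2 = \sum_{R \in \mathcal{U}} |\langle b, v_R \rangle|^2 = 1, this gives \|H_b\| \ge \eta_0 / 2.
```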



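The inequality for $\beta$ is easily available to us, by the particular form of the Journe Lemma
we are using. Observe first that

$$1 + \sum_{\substack{R \subset V \\ R \notin \mathcal{U}}} |\langle b, v_R \rangle|^2 = \sum_{R \subset V} |\langle b, v_R \rangle|^2 \le 1 + \eta_J .$$

Therefore, $\|\beta\|_2 \le \eta_J^{1/2}$. On the other hand, the $\mathrm{BMO}(\mathbb{C}^d_+)$ norm of $\beta$ is less than or equal
to one. Thus, we have $\|\beta\|_4 \lesssim \eta_J^{1/4}$. A Hankel operator is at worst a product, thus

$$\|H_\beta \alpha\| \le \|\beta\|_4 \|\alpha\|_4 \lesssim \eta_J^{1/4} .$$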

So it remains to verify (7.8). 
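
The passage from the $L^2$ and BMO bounds on $\beta$ to the $L^4$ bound is an interpolation step. The lines below are a sketch, assuming the standard interpolation inequality between $L^2$ and BMO (taken for granted here in the product setting) and writing $\eta_J$ for the small parameter attached to Journe's Lemma.

```latex
% Interpolation between L^2 and BMO at the midpoint exponent p = 4:
\[
\|\beta\|_4 \lesssim \|\beta\|_2^{1/2} \, \|\beta\|_{\mathrm{BMO}(\mathbb{C}^d_+)}^{1/2}
  \le \bigl( \eta_J^{1/2} \bigr)^{1/2} \cdot 1 = \eta_J^{1/4} .
\]
% The same inequality applied to \alpha, with \|\alpha\|_2 = 1 and BMO norm at most one, gives \|\alpha\|_4 \lesssim 1.
% Dropping the projection and applying Cauchy-Schwarz to the product:
\[
\|H_\beta \alpha\|_2 = \| P_\oplus ( \beta \overline{\alpha} ) \|_2
  \le \| \beta \overline{\alpha} \|_2
  \le \|\beta\|_4 \, \|\alpha\|_4
  \lesssim \eta_J^{1/4} .
\]
```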



An Initial Calculation. We make an explicit computation of a Hankel operator, in a manner
similar to (3.9). Namely, restricting attention to one dimension, we have

(7.9)  $H_{v_I} v_J = P_+ (v_I \overline{v_J}) = \begin{cases} 0, & 8|J| < |I|, \\ P_+ (v_I \overline{v_J}), & |I| \le 8|J| \le 64|I|, \\ v_I \overline{v_J}, & 8|I| < |J| . \end{cases}$

This follows from the Fourier localization properties of the Meyer wavelet. The Fourier
transform of the product $v_I \overline{v_J}$ is the convolution of the Fourier transforms, so its support
is contained in the sum of the Fourier supports, which are specified by (2.10). If $J$ is much
smaller than $I$, the product is purely antianalytic, giving us the first case above. In the third
case, $v_I \overline{v_J}$ is purely analytic.



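We apply the observation above to the term $H_\gamma \alpha$. This leads us to the conclusion that

$$\|H_\gamma \alpha\|_2 = \Bigl\| P_\oplus \sum_{(R, R') \in \mathcal{A}} \langle b, v_{R'} \rangle \, \overline{\langle b, v_R \rangle} \, v_{R'} \overline{v_R} \Bigr\|_2 ,$$

$$\mathcal{A} = \{ (R, R') : R \in \mathcal{U}, \ R' \not\subset V, \ |R'_s| \le 64 |R_s|, \ 1 \le s \le d \} .$$

It is essential to observe that this last sum can be written as a finite sum of the paraproducts
in Theorem 5.4, applied to the functions $\alpha$ and $\gamma$. This sum varies over choices of $\vec k$ with
$\|\vec k\|_\infty \le 6$, and arbitrary $J \subset \{1, \dots, d\}$. (The subset $J$ consists of those coordinates $s$ for
which $|R_s| = 2^{k_s} |R'_s|$.)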
We use Theorem 5.4 to provide an estimate of the $L^2$ norm of the sum above by an absolute
constant times $\eta_{-1}$. In particular, we want to use the more technical estimate (5.7) to achieve
this end.

We will need to decompose the collection $\mathcal{A}$ into appropriate parts to which this estimate
applies. That is the purpose of this definition. For an integer $n \ge 1$, take

$$\alpha_n \overset{\mathrm{def}}{=} \sum_{\substack{R \in \mathcal{U} \\ 2^{n-1} \le \operatorname{Emb}(R) < 2^n}} \langle b, v_R \rangle v_R .$$

We claim that

(7.10)  $\|H_\gamma \alpha_n\| \lesssim 2^{-n} \eta_{-1} .$

It follows from Lemma 4.10 that we have the estimate

(7.11)  $\|\alpha_n\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \lesssim 2^{2dn} \eta_{-1} ;$

indeed, this is the point of this definition. From other parts of the expansion of the Hankel
operator, we need to find some decay in $n$.

Nevertheless, from this estimate and the upper bound on Hankel operator norms, we have
the estimate

$$\|H_\gamma \alpha_n\| \lesssim \|\alpha_n\|_{\mathrm{BMO}(\mathbb{C}^d_+)} \|\gamma\|_2 \lesssim 2^{2dn} \eta_{-1} .$$

We use this estimate for $n < 20$, say.

For $n \ge 20$, $R \in \mathcal{U}$ with $2^{n-1} \le \operatorname{Emb}(R) < 2^n$, and a rectangle $R'$ with $(R, R') \in \mathcal{A}$, it
follows that we must have $2^{n-9} R \cap R' = \emptyset$. That is, (5.6) is satisfied with the value of $A$ in
that display being $A \simeq 2^n$ for $n \ge 20$. Thus, we conclude that

$$\|H_\gamma \alpha_n\| \lesssim 2^{-50n} \eta_{-1}, \qquad n \ge 20 .$$

This completes our proof of (7.10), and the proof of the lower bound on the norm of Hankel
operators.
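
For completeness, here is how the bound for the individual pieces $\alpha_n$ yields (7.8). This is a sketch assuming (7.10) is read as $\|H_\gamma \alpha_n\| \lesssim 2^{-n} \eta_{-1}$, with implied constants allowed to depend on $d$ and on the parameter $\eta_J$ fixed earlier.

```latex
% Emb takes values in [1, \infty), so the dyadic ranges 2^{n-1} \le Emb(R) < 2^n, n \ge 1, exhaust \mathcal{U}:
\[
\alpha = \sum_{n \ge 1} \alpha_n .
\]
% The map \varphi \mapsto H_\gamma \varphi is additive, so the triangle inequality and a geometric series give
\[
\|H_\gamma \alpha\| \le \sum_{n \ge 1} \|H_\gamma \alpha_n\|
  \lesssim \eta_{-1} \sum_{n \ge 1} 2^{-n}
  = \eta_{-1} .
\]
```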
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